Michael Munn

Authored Publications
    Explainable AI refers to methods and techniques in artificial intelligence (AI) that allow a model's results to be explained in terms that are understandable by human experts. Explainability is one of the key components of what is now referred to as Responsible AI, alongside ML fairness, security, and privacy. A successful XAI system aims to increase trust and transparency in complex ML models in a way that benefits model developers, stakeholders, and users. This book is a collection of some of the most effective and commonly used techniques for explaining why an ML model makes the predictions it does. We discuss the many aspects of Explainable AI, including the challenges, metrics for success, and case studies that illustrate best practices. Ultimately, the goal of this book is to distill the vast amount of work that has been done in Explainable AI into a quick reference for practitioners who aim to implement XAI in their ML development workflow.
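
    As a hedged illustration of the kind of technique the book surveys, here is a minimal sketch of permutation feature importance, a simple model-agnostic explanation method; the function name, model interface, and metric are my assumptions, not code from the book.

    import numpy as np

    def permutation_importance(model, X, y, metric, n_repeats=5, seed=0):
        # Model-agnostic importance: how much does the score drop when one
        # feature column is shuffled, breaking its link to the target?
        rng = np.random.default_rng(seed)
        baseline = metric(y, model.predict(X))  # score on intact data
        importances = np.zeros(X.shape[1])
        for j in range(X.shape[1]):
            drops = []
            for _ in range(n_repeats):
                X_perm = X.copy()
                X_perm[:, j] = rng.permutation(X_perm[:, j])  # destroy feature j's signal
                drops.append(baseline - metric(y, model.predict(X_perm)))
            importances[j] = np.mean(drops)  # mean score drop = importance
        return importances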
    In many contexts, simpler models are preferable to more complex ones, and controlling this model complexity is the goal of many methods in machine learning, such as regularization, hyperparameter tuning, and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suited to deep neural networks. Here we develop the notion of geometric complexity, a measure of the variability of the model function computed using a discrete Dirichlet energy. Using a combination of theoretical arguments and empirical results, we show that many common training heuristics, such as parameter norm regularization, spectral norm regularization, flatness regularization, implicit gradient regularization, noise regularization, and the choice of parameter initialization, all act to control geometric complexity, providing a unifying framework in which to characterize the behavior of deep learning models.
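
    A minimal sketch of how the discrete Dirichlet energy described above can be estimated over a batch of data, assuming a differentiable Keras/TensorFlow model; the function name and normalization are my assumptions, and the paper's exact estimator may differ.

    import tensorflow as tf

    def geometric_complexity(model, x_batch):
        # Mean squared Frobenius norm of the input-output Jacobian over the
        # batch: a discrete Dirichlet energy of the model function.
        x = tf.convert_to_tensor(x_batch)
        with tf.GradientTape() as tape:
            tape.watch(x)
            y = model(x)
        jac = tape.batch_jacobian(y, x)  # shape (batch, out_dim, in_dim)
        sq_norms = tf.reduce_sum(tf.square(jac), axis=[1, 2])
        return tf.reduce_mean(sq_norms)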
    In over-parameterized deep neural networks, there can be many possible parameter configurations that fit the training data exactly. However, the properties of these interpolating solutions are poorly understood. We argue that over-parameterized neural networks trained with stochastic gradient descent are subject to a Geometric Occam's Razor; that is, these networks are implicitly regularized by their geometric model complexity. For one-dimensional regression, the geometric model complexity is simply the arc length of the function; in higher-dimensional settings, it depends on the Dirichlet energy of the function. We explore the relationship between this Geometric Occam's Razor, the Dirichlet energy, and other known forms of implicit regularization. Finally, for ResNets trained on CIFAR-10, we observe that Dirichlet energy measurements are consistent with the action of this implicit Geometric Occam's Razor.
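
    For intuition, the one-dimensional notion of geometric model complexity mentioned above, the arc length of the function, is easy to approximate numerically; this small sketch (names and discretization mine) sums the segment lengths of a piecewise-linear interpolant.

    import numpy as np

    def arc_length(f, a, b, n=10_000):
        # Approximate arc length of f on [a, b]: the one-dimensional geometric
        # model complexity. Assumes f accepts numpy arrays.
        x = np.linspace(a, b, n)
        y = f(x)
        return float(np.sum(np.hypot(np.diff(x), np.diff(y))))

    # A flatter function has smaller complexity: a flat line vs. a fast wiggle.
    print(arc_length(lambda x: 0.0 * x, 0.0, 1.0))         # ~1.0
    print(arc_length(lambda x: np.sin(20 * x), 0.0, 1.0))  # much larger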
    Machine Learning Design Patterns
    Lak V Lakshmanan
    Sara Robinson
    O'Reilly Media (2020)
    In engineering disciplines, best practices and solutions to commonly occurring problems are captured in the form of design patterns. Design patterns codify the experience of hundreds of experts into advice that all practitioners can follow. As ML becomes more mainstream, it is important that practitioners take advantage of tried-and-proven methods to address recurring problems. However, there is no collection of proven design patterns in machine learning. This book remedies that. It is a catalog of design patterns, or repeatable solutions to commonly occurring problems, in ML engineering. For example, the Transform pattern enforces the separation of inputs, features, and transforms, and makes the transformations persistent in order to simplify moving an ML model to production. Similarly, Keyed Predictions is a pattern that enables the large-scale distribution of batch predictions, such as for recommendation models. For each pattern, we describe the commonly occurring problem being addressed, walk through a variety of potential solutions and their tradeoffs, and recommend how to choose between them. Implementation code for these solutions is provided in SQL (useful if you are carrying out preprocessing and other ETL in Spark SQL, BigQuery, etc.) and Keras.
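
    As a hedged sketch of the Keyed Predictions pattern described above (illustrative, not the book's actual code), the wrapper below takes a trained single-input Keras model and returns one that passes a caller-supplied key through to the output, so unordered batch predictions can be joined back to their inputs.

    import tensorflow as tf

    def with_pass_through_key(model):
        # Hypothetical helper: wrap a trained single-input Keras model so each
        # prediction is returned alongside the client-supplied key.
        key = tf.keras.Input(shape=(1,), dtype=tf.string, name="key")
        features = tf.keras.Input(shape=model.input_shape[1:], name="features")
        outputs = {"key": key, "prediction": model(features)}
        return tf.keras.Model(inputs={"key": key, "features": features},
                              outputs=outputs)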