Understanding black-box machine learning models is crucial for their widespread adoption. In this paper, we propose a novel interpretability framework: Reinforcement Learning-based Locally-Interpretable Models (RL-LIM). RL-LIM employs reinforcement learning to select a small number of samples and distill them into a low-capacity, locally-interpretable model. Training is guided by a reward based on the agreement between the predictions of the locally-interpretable model and those of the black-box model. RL-LIM significantly and consistently outperforms state-of-the-art methods in overall prediction performance and fidelity across various cases. While nearly matching the performance of black-box models, RL-LIM yields human-like interpretability, along with the most valuable training samples enabling it. Such a capability is expected to benefit many artificial intelligence deployments: understanding instance-wise dynamics, building trust by explaining the constituent components behind decisions, and enabling actionable insights such as manipulating outcomes.
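To make the core loop concrete, here is a minimal, self-contained sketch of the idea in numpy. It is not the paper's implementation: the black-box model (`black_box`), the distance-based selection policy, and its single learnable log-bandwidth parameter are all illustrative assumptions. It shows the three ingredients the abstract describes: a policy that selects training samples, a low-capacity locally-interpretable model (here, a weighted least-squares line) fit on the selected samples, and a fidelity reward, i.e., the agreement between the local model and the black box at the probe instance, used in a REINFORCE-style update of the policy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box model to be explained (stand-in for any opaque predictor).
def black_box(x):
    return np.sin(3.0 * x) + 0.5 * x

# Training pool and a probe instance whose prediction we want to explain.
X_pool = rng.uniform(-2.0, 2.0, size=200)
y_pool = black_box(X_pool)
x_probe = 0.7

def fit_local_linear(x, y, mask):
    """Low-capacity interpretable model: least-squares line on selected samples."""
    xs, ys = x[mask], y[mask]
    if xs.size < 2:
        return 0.0, float(y.mean())  # degenerate fallback: constant predictor
    A = np.stack([xs, np.ones_like(xs)], axis=1)
    slope, intercept = np.linalg.lstsq(A, ys, rcond=None)[0]
    return float(slope), float(intercept)

# Instance-selection policy: probability of selecting a sample decays with its
# distance to the probe, controlled by a learnable log-bandwidth theta.
theta = 0.0  # log-bandwidth
lr, baseline = 0.05, 0.0
for step in range(300):
    bw = np.exp(theta)
    d = np.abs(X_pool - x_probe) / bw
    probs = np.clip(np.exp(-d), 1e-4, 1.0 - 1e-4)
    mask = rng.random(X_pool.size) < probs  # sample a selection
    w, b = fit_local_linear(X_pool, y_pool, mask)
    # Fidelity reward: agreement with the black box at the probe (negative error).
    reward = -abs((w * x_probe + b) - black_box(x_probe))
    # Score-function (REINFORCE) gradient of sum_i log pi(mask_i) w.r.t. theta:
    #   selected:   d log p_i / d theta =  d_i
    #   unselected: d log(1-p_i) / d theta = -p_i * d_i / (1 - p_i)
    grad_logp = np.where(mask, d, -probs * d / (1.0 - probs)).sum()
    theta += lr * (reward - baseline) * grad_logp
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average reward baseline

# (w, b) now defines a local line whose agreement with the black box near
# x_probe was the training signal for the selection policy.
```

In the actual framework the policy is a trained instance-weight estimator rather than a fixed-form kernel, but the reward structure, i.e., rewarding fidelity of the distilled local model to the black box, follows the same pattern.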