We present a novel task: predicting linguistic typological features from the World Atlas of Language Structures (WALS) directly from multilingual speech. We frame the task as multi-label classification, in which the model predicts a set of non-mutually exclusive, extremely sparse, multi-valued WALS features. We investigate whether the speech modality carries enough signal for a recurrent neural network (RNN) to reliably discriminate between typological features, both for languages included in the training data and for languages withheld from training. We show that the proposed approach identifies typological features with an overall accuracy of 91.6% for the 16 in-domain languages and 71.1% for the 19 held-out languages. In addition, our approach outperforms language identification-based baselines on all languages. We also show that correctly identifying all the typological features of an unseen language remains a distant goal: for 14 of the 19 held-out languages, the prediction error is well above 30%.
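As an illustration of the multi-label framing, the following minimal sketch shows one way sparse, multi-valued features could be encoded as a multi-hot target with a mask over unannotated features, and how accuracy could be computed over annotated features only. The two-feature inventory, the helper names (`encode`, `decode`, `accuracy`), and the assumption that each feature takes exactly one value per language are ours for illustration; they are not the feature set or the evaluation protocol used in the experiments.

```python
# Toy feature inventory: WALS-style feature ID -> possible values.
# Illustrative only; the real inventory is far larger and sparser.
FEATURES = {
    "81A": ["SOV", "SVO", "VSO"],
    "87A": ["Adj-Noun", "Noun-Adj"],
}

# One output slot per (feature, value) pair.
SLOTS = [(feat, val) for feat, vals in FEATURES.items() for val in vals]


def encode(annotations):
    """Return (target, mask) for one language.

    target is a multi-hot vector over SLOTS; mask is 1 for every slot whose
    feature is annotated, so unannotated features can be ignored in the loss.
    """
    target = [1 if annotations.get(feat) == val else 0 for feat, val in SLOTS]
    mask = [1 if feat in annotations else 0 for feat, _ in SLOTS]
    return target, mask


def decode(scores):
    """Pick the highest-scoring value for each feature (assumes one value per feature)."""
    pred = {}
    for feat, vals in FEATURES.items():
        pred[feat] = max(vals, key=lambda val: scores[SLOTS.index((feat, val))])
    return pred


def accuracy(pred, annotations):
    """Fraction of annotated features whose predicted value matches the annotation."""
    known = [feat for feat in annotations if feat in pred]
    if not known:
        return 0.0
    return sum(pred[feat] == annotations[feat] for feat in known) / len(known)
```

For example, a language annotated only for `81A` yields a mask that zeroes out the `87A` slots, so a missing annotation neither rewards nor penalizes the model.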