There are numerous different odorant molecules in nature but only a relatively small number of olfactory receptor neurons (ORNs) in brains. This "compressed sensing" challenge is compounded by the constraint that ORNs are nonlinear sensors with a finite dynamic range. Here, we investigate possible optimal olfactory coding strategies by maximizing the mutual information between odor mixtures and ORN responses with respect to the bipartite odor-receptor interaction network (ORIN), which is characterized by the sensitivities between all odorant–ORN pairs. For ORNs without spontaneous (basal) activity, we find that the optimal ORIN is sparse: a finite fraction of the sensitivities are zero, and the nonzero sensitivities follow a broad distribution that depends on the odor statistics. We show analytically that sparsity in the optimal ORIN originates from a trade-off between the broad tuning of ORNs and possible interference. Furthermore, we show that the optimal ORIN improves performance on downstream learning tasks (reconstruction and classification). For ORNs with a finite basal activity, we find that inhibitory odor-receptor interactions increase the coding capacity, and that the fraction of inhibitory interactions increases with the ORN basal activity. We argue that basal activity in the sensory receptors of different organisms reflects a trade-off between the increase in coding capacity and the cost of maintaining the spontaneous activity. Our theoretical findings are consistent with existing experiments, and we make predictions that can further test the theory. The optimal coding model provides a unifying framework for understanding peripheral olfactory systems across different organisms.