How do languages assign meanings to words? In this talk, I will argue that efficient data compression is a fundamental principle underlying human semantic systems. Specifically, I will argue that languages compress meanings into words by optimizing the Information Bottleneck (IB) tradeoff between the complexity and accuracy of the lexicon, a tradeoff that can be derived from Shannon's Rate–Distortion theory. This proposal has gained substantial empirical support in a series of recent studies using cross-linguistic data from several semantic domains, such as terms for colors and containers. I will show that (1) semantic systems across languages lie near the IB theoretical limit; (2) the optimal systems explain much of the cross-language variation and provide a theoretical explanation for why empirically observed patterns of inconsistent naming and soft category boundaries are efficient for communication; (3) languages may evolve through a sequence of structural phase transitions along the IB theoretical limit; and (4) this framework can be used to generate efficient naming systems from artificial neural networks trained for vision, providing a platform for testing the interaction between neural perceptual representations and high-level semantic representations. These findings suggest that efficient compression may be a major force shaping the structure and evolution of human semantic systems, and may help inform the design of AI systems with human-like semantics.
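To make the complexity–accuracy tradeoff concrete, here is a minimal sketch of the standard IB objective for naming, min over encoders of I(M;W) − β·I(W;U), where M is a meaning, W a word, and U a referent. All distributions and the value of β below are toy values chosen for illustration, not data from the studies described in the talk.

```python
import numpy as np

def mutual_info(joint):
    """I(X;Y) in bits, computed from a joint distribution matrix p(x, y)."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Toy setup: 3 meanings, each a belief over 4 referents (illustrative values).
p_m = np.full(3, 1/3)                        # prior over meanings
p_u_given_m = np.array([[0.7, 0.3, 0.0, 0.0],
                        [0.2, 0.6, 0.2, 0.0],
                        [0.0, 0.0, 0.3, 0.7]])
# A deterministic 2-word lexicon: meanings 1-2 share a word, meaning 3 gets its own.
q_w_given_m = np.array([[1.0, 0.0],
                        [1.0, 0.0],
                        [0.0, 1.0]])

p_mw = p_m[:, None] * q_w_given_m            # joint p(m, w)
complexity = mutual_info(p_mw)               # I(M;W): bits the lexicon encodes

# p(w, u) = sum_m p(m) q(w|m) p(u|m), since U and W are independent given M.
p_wu = p_mw.T @ p_u_given_m
accuracy = mutual_info(p_wu)                 # I(W;U): how informative words are

beta = 1.1                                   # tradeoff parameter (toy value)
ib_objective = complexity - beta * accuracy  # lower is better at this beta
```

Sweeping β and minimizing this objective over encoders q(w|m) traces out the IB theoretical limit against which the attested naming systems are compared.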