May 17, 2024

Information Theory & AI

 

Claude Shannon's Information Theory provides both a theoretical foundation and practical tools that are used extensively across artificial intelligence, including data representation, machine learning, data compression, communication, signal processing, generative modeling, and cryptography.
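
One concrete bridge between the two fields is the cross-entropy loss used to train most modern classifiers and language models. Here is a minimal sketch in plain Python (the label and predicted probabilities are made up for illustration):

import math

def cross_entropy(p, q) -> float:
    # Cross-entropy in bits: H(p, q) = -sum over x of p(x) * log2 q(x).
    # Minimizing it pushes the model's distribution q toward the true distribution p.
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

label = [1.0, 0.0, 0.0]  # one-hot true class over three classes
print(cross_entropy(label, [0.9, 0.05, 0.05]))  # confident and correct: ~0.15 bits (low loss)
print(cross_entropy(label, [0.4, 0.3, 0.3]))    # unsure:                ~1.32 bits (high loss)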


Let’s ask GPT-4 to explain the connection between Information Theory and AI.





May 10, 2024

Information Theory

 


Claude Shannon


As the evolution of AI continues to accelerate, it's important to revisit the past and pay homage to the generations whose contributions the current advances in AI are built upon. One of those giants is the American mathematician Claude Shannon, whose landmark paper ‘A Mathematical Theory of Communication’ was published in the Bell System Technical Journal in 1948. The paper transformed the understanding of communication systems by providing a mathematical framework for quantifying information and its transmission.


Let’s discuss the important concepts in Shannon’s foundational paper.


1)  Information can be defined as a reduction of uncertainty. If an event has high probability, its occurrence carries little information; if an event has low probability, its occurrence carries a great deal of information. For example, the probability of winning a jackpot is extremely small, so the news that you have won is extremely surprising, and therefore highly informative. In short, information content increases as probability decreases: an event with probability p carries -log2(p) bits of information, as the sketch below shows.
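
A minimal Python sketch of this relationship (the jackpot odds below are made up for the example):

import math

def self_information(p: float) -> float:
    # Self-information in bits: I(x) = -log2 p(x).
    return -math.log2(p)

print(self_information(0.99))             # near-certain event: ~0.01 bits
print(self_information(0.5))              # fair coin flip: exactly 1 bit
print(self_information(1 / 300_000_000))  # 1-in-300-million jackpot: ~28 bits of surprise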


2)  Entropy:  Shannon introduced the concept of entropy to measure the uncertainty or randomness of a message source. Just as we can measure the mass of an object in kilograms, we can measure information on a common scale using entropy. Entropy is like an information scale that shows how much information a source produces on average: higher entropy implies higher unpredictability and higher information content, as the sketch below illustrates.
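
A minimal sketch of computing entropy in Python (the distributions are illustrative):

import math

def entropy(probs) -> float:
    # Shannon entropy in bits: H(X) = -sum over x of p(x) * log2 p(x).
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                # fair coin: 1.0 bit (maximally unpredictable)
print(entropy([0.99, 0.01]))              # biased coin: ~0.08 bits (nearly predictable)
print(entropy([0.25, 0.25, 0.25, 0.25]))  # fair four-sided die: 2.0 bits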


3)  Channel Capacity:  Shannon defined channel capacity as the maximum rate at which information can be reliably transmitted over a communication channel. All communication happens through a medium, from a transmitter to a receiver. If you're in a loud room, your message needs to be loud and clear in order to be understood, whereas in a quiet room, a message can survive some noise and still get through. Channel capacity is a theoretical limit on how much information you can push through a given medium, determined by the channel's bandwidth and how noisy it is. Shannon proved that any transmission rate below capacity can be achieved with an arbitrarily small error rate, while no coding scheme can do better; the sketch below makes this concrete.
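
One concrete form of this limit is the Shannon–Hartley theorem, C = B * log2(1 + S/N), for a band-limited channel with Gaussian noise. A minimal sketch (the bandwidth and signal-to-noise figures are made up for illustration):

import math

def channel_capacity(bandwidth_hz: float, snr: float) -> float:
    # Shannon-Hartley capacity in bits per second: C = B * log2(1 + S/N),
    # where snr is the linear (not dB) signal-to-noise ratio.
    return bandwidth_hz * math.log2(1 + snr)

print(channel_capacity(3_000, 1_000))  # "quiet room": 3 kHz at 30 dB SNR -> ~29,900 bits/s
print(channel_capacity(3_000, 1))      # "loud room": 3 kHz at 0 dB SNR  ->   3,000 bits/s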


4)  Data Compression aims to represent information more efficiently by removing redundancy, reducing the amount of data needed to convey the same information. When a source has known statistical structure, its entropy is lower and it can be compressed further; when a source is unpredictable, its entropy is higher and it resists compression. Shannon's source coding theorem makes this precise: entropy is the lower bound on the average number of bits per symbol achievable by lossless compression. To compress beyond entropy, we must be willing to throw away information we deem unnecessary, which is what lossy compression does. The sketch below compares a source's empirical entropy with what a real compressor achieves.
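
A minimal sketch using only the Python standard library (the sample strings are illustrative):

import math
import random
import string
import zlib
from collections import Counter

def char_entropy(text: str) -> float:
    # Empirical entropy of the character distribution, in bits per character.
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
structured = "ab" * 500  # strong statistical structure
unpredictable = "".join(random.choices(string.ascii_lowercase, k=1000))

for name, text in [("structured", structured), ("unpredictable", unpredictable)]:
    bits_per_char = 8 * len(zlib.compress(text.encode())) / len(text)
    print(f"{name}: entropy {char_entropy(text):.2f} bits/char, zlib {bits_per_char:.2f} bits/char")

The structured string compresses far below its naive per-character entropy because zlib also exploits the repetition between characters, while the unpredictable string cannot be squeezed below its entropy floor without discarding information.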


In the next post, we'll explore the far-reaching impact of Shannon's Information Theory in the field of AI.



May 3, 2024

Diffusion Models vs. GANs

 

Now that diffusion models and GANs have each been discussed in my previous posts, let's ask GPT-4 to summarize and compare these two families of generative models.