April 19, 2024

GAN

 



A generative adversarial network (GAN) is a machine learning model, used in unsupervised learning, in which two neural networks compete against each other.  Both networks are trained simultaneously, and the adversarial contest pushes each one to get better at its task.


The GAN architecture pairs a generator network, which turns random inputs into artificial outputs that look real, with a discriminator network, which judges whether a given output is real or fake.  As training progresses, the generator learns to produce samples that are increasingly difficult for the discriminator to distinguish from real ones.  At convergence, the generator's samples are almost indistinguishable from real data.
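To make the generator-discriminator contest concrete, here is a minimal, illustrative training loop in PyTorch.  Everything in it is an assumption chosen for brevity: toy network sizes, a 2D Gaussian standing in for a real dataset, and arbitrary hyperparameters.  It's a sketch of the technique, not a production GAN.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2  # assumed toy dimensions

# Generator: maps random noise to fake samples
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
# Discriminator: outputs the probability that a sample is real
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    # "Real" data: a toy 2D Gaussian stands in for a real dataset
    real = torch.randn(64, data_dim) * 0.5 + 2.0
    fake = G(torch.randn(64, latent_dim))

    # 1) Train the discriminator to label real as 1 and fake as 0
    opt_D.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_D.step()

    # 2) Train the generator to make the discriminator output 1 on fakes
    opt_G.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_G.step()
```

As training alternates between the two updates, the generator's samples drift toward the real distribution, which is exactly the convergence behavior described above.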


The GAN model was first introduced in the 2014 research paper ‘Generative Adversarial Nets’ by Ian Goodfellow et al. at the University of Montreal.  Since then, GANs have seen rapid growth in image, video, and text generation because they focus on generating new samples that never existed before.  Their major benefit is that they can create new data in domains where data collection is difficult or impossible.


Some examples of real-world GAN applications include (1) NVIDIA's GANverse3D, which generates 3D models from single 2D images, (2) The Fabricant, a digital fashion house that generates innovative digital clothing designs, and (3) This Person Does Not Exist, a website that generates lifelike images of human faces that don't belong to real people.

 

In the next post, we will explore a different model that also excels at image, video, and text generation.



April 5, 2024

Inference

 



Inference is the process by which a trained AI model uses new data to make predictions or solve a task.  An AI model typically has 2 phases:


1)  The first phase is training the model, i.e., developing its intelligence by storing, recording, and labeling data.  For example, if you’re training a model to identify a stop sign, you would feed it thousands of stop sign images that it can refer to later.


2)  The second phase is inference, the AI model’s shining moment to prove that the intelligence developed during training can make the right prediction or solve the task.  During inference, the model applies its learned knowledge to real data to provide accurate predictions or generate outputs such as images, text, or video.  This allows businesses to make real-time, data-driven decisions and improve efficiency.  A minimal sketch of both phases follows below.
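As a concrete illustration of the two phases, here is a minimal sketch using scikit-learn.  The synthetic dataset and the model choice are toy assumptions, not an actual stop-sign detector:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# A synthetic labeled dataset stands in for the stop-sign images
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_new, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=0)

# Phase 1 (training): the model learns patterns from labeled data
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Phase 2 (inference): the trained model predicts on data it has never seen
predictions = model.predict(X_new)
print(predictions[:5])
```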



Inferencing is very expensive


Both training and inferencing are computationally expensive; however, training is more or less a one-time compute cost.  Inferencing, on the other hand, is ongoing: every time a user asks a question of an LLM and expects an answer, that’s inferencing.


Now multiply that by millions of users with millions of questions, and you can imagine the huge compute cost incurred by the AI system.  In fact, up to 90% of an AI model’s life may be spent in inference mode, and over that lifetime, inferencing can be an order of magnitude more computationally expensive than training.
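To see why the recurring nature of inference adds up, here is a purely illustrative back-of-envelope calculation; every number in it is invented for the sake of the example:

```python
# Hypothetical numbers, for illustration only
training_cost = 10_000_000      # one-time training compute cost ($)
cost_per_query = 0.002          # compute cost per answered question ($)
queries_per_day = 10_000_000    # daily traffic

# Training is paid once; inference is paid on every single query
yearly_inference_cost = cost_per_query * queries_per_day * 365
print(f"One-time training cost:   ${training_cost:,.0f}")
print(f"Yearly inference cost:    ${yearly_inference_cost:,.0f}")  # recurs every year
```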


March 29, 2024

AI Woodstock

 



Last week, I attended the NVIDIA GTC conference, which was heralded as ‘AI Woodstock'.  The conference was a whirlwind of technology, academia, media, and business in an incredible exchange of AI visions, progress, and achievements.  It was one big schmooze time!




Selfie with NVIDIA CEO Jensen Huang.





California Governor Gavin Newsom conversed with the NVIDIA CEO.



Pivotal technical achievement:  NVIDIA has accelerated computing power so much that Moore’s Law might have expired.  For the past four decades, Moore’s Law held that the computing power of semiconductor chips would essentially double every couple of years with minimal rise in cost; in its best days, that compounding delivered roughly tenfold gains every five years and a hundredfold every ten.


But NVIDIA’s integration of powerful GPUs (the latest Blackwell GPU packs 208 billion transistors), its innovative CUDA software platform, and advanced networking has eclipsed Moore’s Law. Over the last eight years, NVIDIA has increased the computing power of its GPUs by a massive 1,000 times!





ServiceNow + NVIDIA



Strategic partnership:  ServiceNow and NVIDIA have expanded their business partnership, with ServiceNow becoming one of the first platform providers to use NVIDIA NIM inference microservices to enable faster and more cost-effective LLM deployment.


ServiceNow, the leading digital workflow automation company, is at the forefront of embracing generative AI to make its family of premier products work even better for customers.  Please visit the ServiceNow website to learn more about its AI-powered products.



March 22, 2024

AI Types (by Functionalities)

 



AI systems can also be classified into different types based on functionalities.  The 4 types are:


1) Reactive Machines:  These are the oldest form of AI systems and don’t store memories.  They can't use past experiences to determine future actions and work only with present data, so they don’t have the ability to learn.  They’re task-specific and have no capabilities beyond those tasks.  An example of a reactive machine is IBM’s Deep Blue, which beat world chess champion Garry Kasparov in 1997.


2) Limited Memory Machines:  These AI systems can learn from past and present data, events, or outcomes to make decisions.  But this data isn’t saved into the systems’ memory as experiences to learn from over a long-term period.  These systems become smarter as they get trained on more data.  Currently, almost all existing AI systems fall under this category.

Example systems include generative AI tools such as ChatGPT, Gemini, or Claude that rely on limited memory AI capabilities to predict the next word, phrase or visual element within the content they’re generating.


3) Theory of Mind:  These theoretical, advanced AI systems would have the ability to understand other people’s emotions, sentiments, and thoughts.  In turn, this would affect how they behave toward those around them, i.e., as people’s emotions and thoughts change, the AI systems would adapt their behavior in response.  There’s no real-world example yet, as these systems are only theoretical for now.


4) Self-Aware AI:  This is the final stage of AI evolution, where machines have a sense of self-awareness, a conscious understanding of their own existence.  These AI systems advance beyond theory of mind, which understands other people’s emotions, to sensing or predicting others’ feelings while having emotions, needs, and beliefs of their own.  For example, they would advance from feeling “I’m hungry” to knowing “I’m hungry” or “I like pizza because it’s my favorite food.”


The development of self-aware AI is the pinnacle of AI evolution, and it could advance our civilization tremendously.  However, it could also have the opposite effect: once AI systems achieve self-awareness, they might grasp concepts like self-preservation, outmaneuver human control, and plan cunning deceptions to take over humanity.


March 15, 2024

AI Types (by Capabilities)

 



As the evolution of artificial intelligence (AI) continues its advance toward mimicking human intelligence, let’s look at the 3 different AI types based on capabilities:


1) Narrow AI:  These AI systems are designed and trained for specific tasks and can only perform those tasks.  Since they can’t function outside of their training models or defined tasks, they’re also known as weak AI.  These systems are the only type of AI available today.  Some examples of narrow AI include LLMs such as ChatGPT, and virtual assistants such as Apple's Siri, Amazon's Alexa, and Google Assistant.


2) General AI:  These AI systems can learn, understand, and function the way humans can, just like a super-smart human being.  Since these systems can teach themselves new tasks without the need to retrain the underlying model, they’re also known as strong AI.  Currently, general AI is only theoretical, but possible examples could be machines with full reasoning capability.


3) Superintelligent AI:  These AI systems would outperform humans at any task, such as thinking, reasoning, learning, or making judgments.  Superintelligent AI comes after general AI, so these systems are strictly theoretical, but possible examples could be robots with their own needs, emotions, desires, and beliefs.  This futuristic, speculative scenario is when technology reaches the point of singularity, where its growth might become uncontrollable and irreversible.


Even though the AI field has grown rapidly, we're still in the weak AI phase.  This means there's still a long way to go to reach the more advanced forms of AI.  In the next post, we'll look into the different AI types by functionalities.



March 8, 2024

Globe Explorer

 

Imagine an application that combines an LLM’s AI algorithms + Google’s search engine + Wikipedia’s encyclopedia.  What do you get?


Introducing Globe Explorer, a discovery engine that searches and presents information with a table of contents (similar to Wikipedia’s) and links to articles (similar to Google’s), but with visually appealing pictures.  Globe Explorer uses LLMs to understand your prompt and generate a comprehensive page on that topic, organized categorically and visually.


Let's dig in.  Prompt Globe Explorer with 'machine learning' and you'll get an extensive table of contents on the left and visual links organized by topic on the right.





Globe Explorer is not limited to technical subjects; it can handle any topic, from pop stars like 'Jennifer Lopez',





to art like 'abstract art'.





Using Globe Explorer can be addictive because of its captivating user experience.  Globe Explorer is fast, fun, and powerful, and with a little refinement of its logo and landing page, it has the potential to be a next-level search engine.  In this age of AI, rapid innovation is the norm, and vast improvements over existing products can happen overnight.



March 1, 2024

RAG

 

If you use LLMs like ChatGPT, you might find that on some days ChatGPT provides excellent, accurate answers, but on other days it churns out arbitrary, hallucinated answers.  This is an inherent problem of data inconsistency in LLMs, which might be caused by the out-of-date information they were trained on.


Enter Retrieval-Augmented Generation (RAG), a solution that connects LLMs to a data store and supplements them with knowledge that can be either open source, like the Internet, or closed source, like a private collection of documents.





RAG was first proposed in Facebook's 2020 research paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” by Patrick Lewis et al. as a technique for enhancing transformer-based language models.  Some benefits of the RAG framework include:  (1) it avoids retraining LLMs by augmenting them with updated information, (2) the data store can be updated without incurring significant costs, and (3) it provides references to the information sources.
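To make the retrieve-then-augment flow concrete, here is a minimal sketch in Python.  The documents, the word-overlap relevance scorer (standing in for a real embedding model and vector search), and the prompt format are all illustrative assumptions, and the final call to an LLM is omitted:

```python
# A tiny "data store" of up-to-date documents (contents are illustrative)
documents = [
    "Retrieval-Augmented Generation (RAG) was proposed by Patrick Lewis et al. in 2020.",
    "A GAN pits a generator network against a discriminator network.",
]

def relevance(question: str, doc: str) -> int:
    # Toy relevance score: count shared words. A production system would
    # use dense vector embeddings and cosine similarity instead.
    return len(set(question.lower().split()) & set(doc.lower().split()))

def build_prompt(question: str) -> str:
    # Retrieval step: pick the best-matching document from the data store
    context = max(documents, key=lambda d: relevance(question, d))
    # Augmentation step: prepend the retrieved context so the LLM can
    # ground its answer and cite the source
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Who proposed RAG and when?"))
# Generation step: the augmented prompt would then be sent to the LLM.
```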


Let’s ask GPT-4 to describe the difference between RAG and a transformer-based language model.








Currently, RAG is still in its early phase.  It's a cost-effective way to enhance LLMs' ability to provide relevant, up-to-date information without retraining the model.  Implementing RAG can increase user trust (information sources can be verified) and improve user experience (more accurate answers and fewer hallucinations).  Can't wait to see how RAG will contribute to the evolution of AI in the years to come.