April 5, 2024

Inference

 



Inference is the process by which a trained AI model uses new data to make predictions or to solve a task.  An AI model’s life cycle typically has 2 phases:


1)  The first phase is training the model, or developing its intelligence, by storing, recording, and labeling data.  For example, if you’re training a model to identify a stop sign, you would feed the model thousands of stop sign images that it can refer to later.


2)  The second phase is inference, the AI model’s shining moment to prove that the intelligence developed during training can make the right prediction or solve the task.  During inference, the model applies its learned knowledge to real data to provide accurate predictions or generate outputs, such as images, text, or video.  This allows businesses to make real-time, data-driven decisions and improve efficiency.
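The two phases above can be sketched in a few lines of code.  The toy nearest-centroid classifier below is purely illustrative (hypothetical feature vectors, not a real stop-sign model): train once on labeled data, then run inference repeatedly on fresh inputs.

```python
# A minimal sketch of the two phases, using a toy nearest-centroid
# classifier (hypothetical example data, not a real stop-sign model).

def train(examples):
    """Phase 1: learn one centroid per label from labeled data."""
    sums, counts = {}, {}
    for features, label in examples:
        s = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            s[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in s] for label, s in sums.items()}

def infer(model, features):
    """Phase 2: apply the learned model to new, unseen data."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))

# Phase 1: train once on labeled data.
model = train([([1.0, 1.0], "stop"), ([0.9, 1.1], "stop"),
               ([5.0, 5.0], "yield"), ([5.2, 4.8], "yield")])

# Phase 2: inference runs again and again on new inputs.
print(infer(model, [1.1, 0.9]))   # -> stop
print(infer(model, [4.9, 5.1]))   # -> yield
```

The point of the sketch: `train` runs once and produces a fixed model, while `infer` is called for every new input, which is why inference dominates the compute bill over a model’s lifetime.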



Inferencing is very expensive


Both training and inferencing are computationally expensive.  However, training is more or less a one-time compute cost.  Inferencing, on the other hand, is ongoing: every time a user asks an LLM a question and expects an answer, that’s inferencing.


Now multiply that by millions of users with millions of questions, and you can imagine the huge compute cost incurred by the AI system.  In fact, up to 90% of an AI model’s life might be spent in inference mode.  Over a model’s lifetime, inferencing can be an order of magnitude more computationally expensive than training.



March 29, 2024

AI Woodstock

 



Last week, I attended the NVIDIA GTC conference, which was heralded as ‘AI Woodstock’.  The conference was a whirlwind of technology, academia, media, and business in an incredible exchange of AI visions, progress, and achievements.  It was one big schmooze time!




Selfie with NVIDIA CEO Jensen Huang.





California Governor Gavin Newsom conversed with the NVIDIA CEO.



Pivotal technical achievement:  NVIDIA has accelerated computing power so much that Moore’s Law might have expired.  For the past 4 decades, Moore’s Law held that the computing power of semiconductor chips would essentially double every 18 to 24 months, compounding to roughly 10 times in five years, or 100 times in 10 years, with minimal rise in cost.


But NVIDIA’s integration of powerful GPUs (the latest Blackwell GPU packs 208 billion transistors), the innovative CUDA software platform, and advanced networking has eclipsed Moore’s Law.  Over the last 8 years, NVIDIA has increased the computing power of its GPUs by a massive 1,000 times!
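A quick back-of-the-envelope check puts the two growth rates side by side.  This assumes smooth exponential growth, which is a simplification of how chips actually improve:

```python
# Compare NVIDIA's claimed 1,000x in 8 years against classic
# Moore's-Law doubling every two years, as annual growth factors.
annual_nvidia = 1000 ** (1 / 8)     # ~2.37x per year
annual_moore = 2 ** (1 / 2)         # ~1.41x per year

# Over the same 8 years, Moore's-Law doubling yields only 2^4 = 16x.
moore_8_years = annual_moore ** 8

print(round(annual_nvidia, 2))      # -> 2.37
print(round(annual_moore, 2))       # -> 1.41
print(round(moore_8_years))         # -> 16
```

In other words, 1,000x in 8 years is roughly a doubling every 10 months, well ahead of the classic two-year cadence.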





ServiceNow + NVIDIA



Strategic partnership:  ServiceNow and NVIDIA have expanded their business partnership, with ServiceNow becoming one of the first platform providers to use NVIDIA NIM inference microservices to enable faster and more cost-effective LLM deployment.


ServiceNow, the leading digital workflow automation company, is at the forefront of embracing generative AI to make its family of premier products work even better for its customers.  Please visit the ServiceNow website to learn more about its AI-powered products.



March 22, 2024

AI Types (by Functionalities)

 



AI systems can also be classified into different types based on functionalities.  The 4 types are:


1) Reactive Machines:  These are the oldest form of AI systems, and they don’t store memories.  They can’t use past experiences to determine future actions and work only with present data, so they don’t have the ability to learn.  They’re task-specific and don’t have capabilities beyond those tasks.  An example of a reactive machine is IBM’s Deep Blue, which beat world chess champion Garry Kasparov in 1997.


2) Limited Memory Machines:  These AI systems can learn from past and present data, events, or outcomes to make decisions.  But this data isn’t saved into the systems’ memory as experience to learn from over the long term.  These systems become smarter as they get trained on more data.  Currently, almost all existing AI systems fall under this category.

Example systems include generative AI tools such as ChatGPT, Gemini, or Claude that rely on limited memory AI capabilities to predict the next word, phrase or visual element within the content they’re generating.


3) Theory of Mind:  These theoretical advanced AI systems would have the ability to understand other people’s emotions, sentiments, and thoughts.  In turn, this would affect how they behave in relation to those around them, i.e., as people’s emotions and thoughts change, the AI systems would adapt and behave in response to those changes.  There’s no real-world example yet, as these systems are only theoretical for now.


4) Self-Aware AI:  This is the final stage of AI evolution, where machines have a sense of self-awareness, a conscious understanding of their own existence.  These AI systems advance beyond the theory of mind that understands other people’s emotions to sensing or predicting others’ feelings while having emotions, needs, and beliefs of their own.  For example, they would advance from feeling “I’m hungry” to knowing “I’m hungry” or “I like pizza because it’s my favorite food.”


The development of self-aware AI is the pinnacle of AI evolution, which could advance our civilization tremendously.  However, it could also have the opposite effect: once they achieve self-awareness, AI systems might understand concepts like self-preservation, outmaneuver human control, and plan cunning deceptions to take over humanity.
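The contrast between the first two types can be sketched in code.  This toy example (hypothetical agents and observations, not real systems) shows a reactive policy that sees only the present input versus a limited-memory agent that keeps a short rolling history:

```python
# Toy sketch of the first two AI types (illustrative only).
from collections import deque

def reactive_policy(observation):
    """Reactive machine: output depends only on the present input."""
    return "brake" if observation == "stop_sign" else "drive"

class LimitedMemoryAgent:
    """Limited memory: keeps a short rolling window of recent inputs."""
    def __init__(self, window=3):
        self.history = deque(maxlen=window)

    def act(self, observation):
        self.history.append(observation)
        # The decision uses recent context, not just the current frame.
        if self.history.count("stop_sign") >= 2:
            return "brake"
        return "drive"

print(reactive_policy("stop_sign"))   # -> brake

agent = LimitedMemoryAgent()
print(agent.act("stop_sign"))         # -> drive (only one sighting so far)
print(agent.act("stop_sign"))         # -> brake (recent context confirms it)
```

The reactive policy gives the same answer for the same input every time, while the limited-memory agent’s answer changes as its short history fills up, which is the defining difference between the two types.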


March 15, 2024

AI Types (by Capabilities)

 



As artificial intelligence (AI) continues its advance toward mimicking human intelligence, let’s look at the 3 different AI types based on capabilities:


1) Narrow AI:  These AI systems were designed and trained for specific tasks and can only perform those tasks.  Since these systems can’t function outside of their training models or defined tasks, they’re also known as weak AI.  These systems are the only type of AI available today.  Some examples of narrow AI include LLMs such as ChatGPT, or virtual assistants such as Apple’s Siri, Amazon’s Alexa, and Google Assistant.


2) General AI:  These AI systems can learn, understand, and function like humans can, just like a super-smart human being.  Since these systems can teach themselves new tasks without the need to retrain the underlying model, they’re also known as strong AI.  Currently, general AI is only theoretical, but possible examples could be machines with full reasoning capability.


3) Superintelligent AI:  These AI systems outperform humans at any task, such as thinking, reasoning, learning, or making judgments.  Superintelligent AI comes after general AI, so these systems are strictly theoretical, but possible examples could be robots with their own needs, emotions, desires, and beliefs.  This futuristic and speculative scenario is when technology reaches the point of singularity, where its growth might become uncontrollable and irreversible.


Even though the AI field has grown rapidly, we're still in the weak AI phase.  This means there's still a long way to get to the more advanced forms of AI.  In the next post, we'll look into different AI types by functionalities.



March 8, 2024

Globe Explorer

 

Imagine an application that combines an LLM’s AI algorithms + Google’s search engine + Wikipedia’s encyclopedia.  What do you get?


Introducing Globe Explorer, a discovery engine that searches and presents information with a table of contents (similar to Wikipedia’s) and links to articles (similar to Google’s), but with visually appealing pictures.  Globe Explorer uses LLMs to understand your prompt and generate a comprehensive page on that topic, organized categorically and visually.


Let's dig in.  Prompt Globe Explorer with 'machine learning' and you'll get an extensive table of contents on the left and visual links under different topics on the right.





Globe Explorer is not limited to technical subjects but can handle any topic, from pop stars like 'Jennifer Lopez',





to art, like 'abstract art'.





Using Globe Explorer can be addictive because of its captivating user experience.  Globe Explorer is fast, fun, and powerful, so with a little refinement of the logo and the landing page, it has the potential to be a next-level search engine.  In this age of AI, rapid innovation is the norm, and vast improvement over existing products can happen overnight.



March 1, 2024

RAG

 

If you use LLMs like ChatGPT, you might find that on some days ChatGPT provides excellent, accurate answers, but on other days churns out arbitrary answers riddled with hallucinations.  This is an inherent problem of data inconsistency in LLMs, which might be caused by the out-of-date information that LLMs were trained on.


Enter Retrieval-Augmented Generation (RAG), a solution that connects LLMs to a data store and supplements LLMs with knowledge that can be either open-source like the Internet or closed-source like a collection of documents.





RAG was first proposed in Facebook's 2020 research paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” by Patrick Lewis et al. as a technique for enhancing the transformer model.  Some benefits of the RAG framework include:  (1) it avoids retraining LLMs by augmenting them with updated information, (2) data can be updated without incurring significant costs, and (3) it provides references for the information sources.
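The retrieve-then-augment loop at the heart of RAG can be sketched in a few lines.  Everything here is a deliberate simplification: the document store is a hypothetical list of strings, and the word-overlap relevance score stands in for the embedding vectors and vector databases that real RAG systems use.

```python
# A minimal RAG sketch: retrieve the most relevant documents, then
# augment the prompt with them before it goes to an LLM.

documents = [
    "RAG connects an LLM to an external data store",
    "Transformers predict the next token from training data alone",
    "Retrieved passages can be cited as sources in the answer",
]

def score(query, doc):
    """Toy relevance score: number of shared lowercase words."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def retrieve(query, docs, k=2):
    """Return the top-k documents by the toy relevance score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, docs):
    """Augmentation step: prepend retrieved context to the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How does RAG use a data store", documents)
print(prompt)  # retrieved context first, then the question
```

Because the context is fetched at query time, updating the system means updating the document store, not retraining the model, which is exactly benefit (2) above.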


Let’s ask GPT-4 to describe the difference between RAG and a transformer-based language model.








Currently, RAG is still in its early phase.  It's a cost-effective way to enhance LLMs' capability to provide relevant, up-to-date information without retraining the model.  Implementing RAG can increase user trust (information sources can be verified) and improve user experience (more accurate answers and fewer hallucinations).  Can't wait to see how RAG will contribute to the evolution of AI in the years to come.


February 23, 2024

Adversarial AI

 

With all the innovations and wonders that AI is going to unfurl on our lives, let’s not forget that AI can also be used adversarially to harm us.  Bad actors might exploit vulnerabilities in AI systems to disrupt and subvert the systems’ intended purpose.


Some examples of adversarial AI attacks include:


1)  Image recognition in automobiles might be corrupted to misinterpret a ‘Stop’ sign as a ‘Yield’ sign, which might cause an accident.


2)  The algorithms of a financial system might be manipulated to cause a stock market crash and destabilize the economy.


3)  Cybersecurity attacks on corporate IT systems might disrupt a company’s daily operations.
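The stop-sign attack in example 1 can be illustrated with a toy version of the classic fast gradient sign method (FGSM) idea: nudge each input feature slightly in the direction that most hurts the classifier.  The linear model, weights, and data below are all hypothetical; real attacks target deep networks, but the mechanics are the same.

```python
# Toy adversarial perturbation in the spirit of FGSM: a tiny, targeted
# change to the input flips the classifier's decision.

def sign(x):
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def predict(w, b, x):
    """Linear classifier: 'stop' if w.x + b > 0, else 'yield'."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "stop" if s > 0 else "yield"

# For a linear model, the gradient of the score w.r.t. x is just w,
# so stepping against sign(w) pushes the score toward the other class.
def perturb(w, x, eps):
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]

w, b = [0.6, -0.4], 0.0
x = [0.5, 0.2]                    # clean input, classified 'stop'
x_adv = perturb(w, x, eps=0.4)    # small adversarial perturbation

print(predict(w, b, x))           # -> stop
print(predict(w, b, x_adv))       # -> yield
```

The unsettling part is how small the perturbation is: each feature moves by only 0.4, yet the model now reads a ‘Stop’ as a ‘Yield’.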



The MITRE organization, a consortium of government, industry, and academia, has prepared ATLAS™ (Adversarial Threat Landscape for Artificial-Intelligence Systems), a comprehensive knowledge base of adversary techniques against AI systems.  MITRE's objective is to increase awareness of the evolving vulnerabilities that might exist in AI systems.






It's concerning that there are so many ways bad actors can exploit AI systems for nefarious purposes.  As a result, AI developers need to be aware of evolving vulnerabilities and take important steps to ensure their AI models are built with strong safety protocols.