Merge pull request #3 from RedHatQuickCourses/Introduction-Updates
Introduction Section Updates
kknoxrht authored Jun 6, 2024
2 parents f804878 + 867b03a commit ec84037
Showing 6 changed files with 93 additions and 50 deletions.
38 changes: 24 additions & 14 deletions modules/ROOT/pages/index.adoc
@@ -1,9 +1,18 @@
= Serving LLM Models on OpenShift AI
:navtitle: Home

Welcome to this Quick course on _Deploying an LLM using OpenShift AI_. This is the first of a set of advanced courses about Red Hat OpenShift AI:
Welcome to this quick course on _Serving an LLM using OpenShift AI_.

IMPORTANT: The hands-on labs in this course were created and tested with RHOAI v2.9.1. Labs should mostly work without any changes in minor dot release upgrades of the product. Please open issues in this repository if you face any issue.
This program is designed to guide you through the process of installing an OpenShift AI platform using the OpenShift Container Platform web console. We gain hands-on experience with each component needed to enable an RHOAI platform on an OpenShift Container Platform cluster.

Once we have an operational OpenShift AI platform, we will log in and begin configuring model runtimes, data science projects, and data connections, and finally use a Jupyter notebook to infer the answers to simple questions.

There will be some challenges along the way, each designed to teach us about a component or to build the knowledge needed to use OpenShift AI to host a large language model.

If you're ready, let's get started!


IMPORTANT: The hands-on labs in this course were created and tested with RHOAI v2.9.1 and later. Labs should work without changes across minor dot-release upgrades of the product. Please open an issue in this repository if you encounter any problems.


== Authors
@@ -18,37 +27,38 @@
The PTL team acknowledges the valuable contributions of the following Red Hat as

== Classroom Environment

This introductory course has a few, simple hands-on labs. You will use the Base RHOAI on AWS catalog item in the Red Hat Demo Platform (RHDP) to run the hands-on exercises in this course.
We will use the https://demo.redhat.com/catalog?item=babylon-catalog-prod%2Fopenshift-cnv.ocpmulti-wksp-cnv.prod[*Red Hat OpenShift Container Platform Cluster*] catalog item in the Red Hat Demo Platform (RHDP) to run the hands-on exercises in this course.

This course will utilize the *Red Hat OpenShift Container Platform Cluster*.
[TIP]
If you plan to start this course now, go ahead and launch the workshop. It takes less than 10 minutes to provision, which is just enough time to finish the introduction section.

When ordering this catalog item in RHDP:

* Select Practice/Enablement for the Activity field
. Select Practice/Enablement for the Activity field

* Select Learning about the Product for the Purpose field
. Select Learning about the Product for the Purpose field

* Enter Learning RHOAI in the Salesforce ID field
. Enter Learning RHOAI in the Salesforce ID field

* Scroll to the bottom, check the box to confirm acceptance of terms and conditions
. Scroll to the bottom, check the box to confirm acceptance of terms and conditions

* Click order
. Click order

For Red Hat partners who do not have access to RHDP, provision an environment using the Red Hat Hybrid Cloud Console. Unfortunately, the labs will NOT work on the trial sandbox environment. You need to provision an OpenShift AI cluster on-premises, or in the supported cloud environments by following the product documentation at Product Documentation for Red Hat OpenShift AI 2024.
For Red Hat partners who do not have access to RHDP, provision an environment using the Red Hat Hybrid Cloud Console. Unfortunately, the labs will NOT work in the trial sandbox environment. You need to provision an OpenShift AI cluster on-premises or in a supported cloud environment by following the product documentation at https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html/installing_and_uninstalling_openshift_ai_self-managed/index[Product Documentation for Red Hat OpenShift AI 2024].

== Prerequisites

For this course, basic experience with Red Hat OpenShift is recommended but is not mandatory.
* Basic experience with Red Hat OpenShift Container Platform is recommended but is not mandatory.

You will encounter & modify code segments, deploy resources using Yaml files, and have to modify launch configurations, but you will not have to write code.
* We will encounter and modify code segments, deploy resources using YAML files, and modify launch configurations, but we will not have to write code.

== Objectives

The overall objectives of this introductory course include:
The overall objectives of this course include:

* Become familiar with using Red Hat OpenShift AI to serve and interact with an LLM.

* Installing Red Hat OpenShift AI Operator & Dependencies
* Installing Red Hat OpenShift AI Operators & Dependencies

* Add a custom Model Serving Runtime

3 changes: 2 additions & 1 deletion modules/chapter1/nav.adoc
@@ -1 +1,2 @@
* xref:index.adoc[]
* xref:index.adoc[]
** xref:section1.adoc[]
31 changes: 3 additions & 28 deletions modules/chapter1/pages/index.adoc
@@ -2,41 +2,16 @@


[NOTE]
This segment of the course provides context to know & analogies to guide us to comprehend the purpose of guided lab in the next section. Feel free to skip ahead if you just want to get started.
This unit of the course provides context around the technologies we encounter and a few analogies to facilitate understanding the purpose of the guided lab in the next section. Feel free to skip ahead if you just want to get started.

=== Why this technical course?

Previously, read a post on LinkenIn and felt it summed up the why quite nicely.
A Formula One driver doesn't need to know how to build an engine to be an F1 champion. However, they need to have *mechanical sympathy*: an understanding of the car's mechanics that lets them drive it effectively and get the best out of it.

It described the basic idea that a Formula One Driver doesn't need to know the how to build an engine to be an F1 champion. However, she/he needs to have a *mechanical sympathy* which is understanding of car's mechanics to drive it effectively and get the best out it.

The same applies to AI, we don't need to be AI experts to harness the power of large language models but we to develop a certain level of "mechanical sympathy" with how these Models are Selected, Operationized, Served, Infered from, and kept up to date, to work with AI in harmony. Not just as users, but as collaborators who understand the underlying mechanics to communicate with clients, partners, and co-workers effectively.

It's not just about the Model itself, it's about the platform that empowers us to create trushtworthy AI applications and guides us in making informed choices.
The same applies to AI: we don't need to be AI experts to harness the power of large language models, but we do need to develop a certain level of *technological awareness* of how LLMs are trained, selected, operationalized, delivered, inferred from, fine-tuned, augmented, and kept up to date. Not just as users, but as practitioners who understand the underlying components well enough to communicate effectively with clients, partners, and co-workers.

The true power lies in the platform that enables us to harness a diverse range of AI models, tools, and infrastructure, and to operationalize our ML projects.

That platform, *OpenShift AI*, is what we learn to create, configure, and utilize to serve LLM models in this quick course.


=== The Ollama Model Framework

LLMs - Large Language Models (LLMs) can generate new stories, summarize texts, and even performing advanced tasks like reasoning and problem solving, which is not only impressive but also remarkable due to their accessibility and easy integration into applications.

There are a lot of popular LLMs, Nonetheless, their operation remains the same: users provide instructions or tasks in natural language, and the LLM generates a response based on what the model "thinks" could be the continuation of the prompt.

Ollama is not an LLM Model - Ollama is a relatively new but powerful open-source framework designed for serving machine learning models. It's designed to be efficient, scalable, and easy to use, making it an attractive option for developers and organizations looking to deploy their AI models into production.

==== How does Ollama work?


*At its core, Ollama simplifies the process of downloading, installing, and interacting with a wide range of LLMs, empowering users to explore their capabilities without the need for extensive technical expertise or reliance on cloud-based platforms.

In this course, we will focus on single LLM, Mistral. However, with the understanding of the Ollama Framework, we will be able to work with a variety of large language models utilizing the exact same configuration.

You be able to switch models in minutes, all running on the same platform. This will enable you test, compare, and evalute multiple models with the skills gained in the course.

*Experimentation and Learning*

Ollama provides a powerful platform for experimentation and learning, allowing users to explore the capabilities and limitations of different LLMs, understand their strengths and weaknesses, and develop skills in prompt engineering and LLM interaction. This hands-on approach fosters a deeper understanding of AI technology and empowers users to push the boundaries of what’s possible.*

57 changes: 56 additions & 1 deletion modules/chapter1/pages/section1.adoc
@@ -1,2 +1,57 @@
= Follow up Story
= Technology Components

== Kubernetes & OpenShift

OpenShift builds upon Kubernetes by providing an enhanced platform with additional capabilities. It simplifies the deployment and management of Kubernetes clusters while adding enterprise features, developer tools, and security enhancements.

In addition, OpenShift provides a graphical user interface for Kubernetes. OpenShift AI runs on OpenShift; therefore, the engine under the hood of both products is Kubernetes.

Most workloads are deployed in Kubernetes via YAML files. A Kubernetes Deployment YAML file is a configuration file, written in YAML (YAML Ain't Markup Language), that defines the desired state of a Kubernetes Deployment. These YAML files are used to create, update, or delete Deployments in Kubernetes and OpenShift clusters.
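
For illustration, a minimal Deployment manifest might look like the following. This is a generic, hypothetical sketch rather than a file from this course; the resource name and image are placeholders:

[source,yaml]
----
# Hypothetical minimal Kubernetes Deployment manifest.
# The name, labels, and image are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1                 # desired number of pod replicas
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app      # must match the selector above
    spec:
      containers:
        - name: example-app
          image: quay.io/example/app:latest  # placeholder image
          ports:
            - containerPort: 8080
----

Applying a file like this with `oc apply -f deployment.yaml` asks the cluster to converge on the declared state, which is exactly what OpenShift AI does on our behalf when we click through the UI.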

Don't worry about needing to know how to write these files; that's what OpenShift and OpenShift AI take care of for us. In this course, we just need to select the options we want in the UI, and OpenShift plus OpenShift AI will create the YAML deployment files.

We will have to perform a few YAML copy-and-paste operations; instructions are provided in the course.

Just know that YAML files create resources in the Kubernetes platform directly. We primarily use the OpenShift AI UI to perform these tasks to deliver our LLM.

== Large Language Models

Large Language Models (LLMs) can generate new stories, summarize texts, and even perform advanced tasks such as reasoning and problem solving. This is not only impressive but also remarkable given their accessibility and easy integration into applications.

As you have probably heard, training large language models is expensive and time consuming, and, most importantly, it requires a vast amount of data to be fed into the model.

The common outcome of this training is a foundation model: an LLM designed to understand and generate human-like text across a wide range of use cases.

The key to this powerful language-processing architecture *is the Transformer!* A helpful definition of a *Transformer* is a set of neural networks consisting of an encoder and a decoder with self-attention capabilities. The Transformer was created at Google and started as a language-translation architecture. It analyzes relationships between words in text, which is crucial for LLMs to understand and generate language.
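
For background, the scaled dot-product attention at the heart of the Transformer is commonly written as follows (the standard formulation from the literature, shown here only as reference material, not something specific to this course):

[stem]
++++
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
++++

where _Q_, _K_, and _V_ are the query, key, and value matrices derived from the input tokens, and _d~k~_ is the dimension of the keys.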

This is how LLMs are able to predict the next word: the transformer neural network's attention mechanism focuses on key words to determine context, then uses that context and the _knowledge_ from all the ingested data to predict the word that follows a sequence of words.

=== Modifications to LLMs

As mentioned above, LLMs are normally large and require graphics cards and costly compute resources to load the model into memory.

However, there are techniques for compressing large models, making them smaller and faster to run on devices with limited resources:

* Quantization reduces the precision of numerical representations in large language models to make them more memory-efficient during deployment.

* Pruning trims surplus connections or parameters to make LLMs smaller and faster while still performant, saving computational resources without sacrificing quality.

In this course, we will be using a quantized version of the Mistral large language model. Instead of requiring 24 GB of memory and a graphics processing unit to run the neural network, we are going to run our model with 4 CPUs and 8 GB of RAM, burstable to 8 CPUs and a maximum of 10 GB of RAM.
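
In a Kubernetes pod specification, those figures would typically appear as resource requests and limits, roughly like this (a minimal sketch that simply encodes the numbers above; the actual lab manifests may differ):

[source,yaml]
----
# Illustrative container resources matching the figures above.
resources:
  requests:
    cpu: "4"          # guaranteed CPUs
    memory: 8Gi       # guaranteed memory
  limits:
    cpu: "8"          # burstable CPU ceiling
    memory: 10Gi      # maximum memory
----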

[NOTE]
https://www.redhat.com/en/topics/ai/what-is-instructlab[*InstructLab*], which runs locally on laptops, uses this same type of quantized LLM: both the Granite and Mixtral large language models are reduced in precision to operate on a laptop.

== The Ollama Model Framework

There are hundreds of popular LLMs. Nonetheless, their operation remains the same: users provide instructions or tasks in natural language, and the LLM generates a response based on what the model "thinks" could be the continuation of the prompt.

Ollama is not an LLM. Ollama is a relatively new but powerful open-source framework designed for serving machine learning models. It is efficient, scalable, and easy to use, making it an attractive option for developers and organizations looking to deploy their AI models into production.
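
In OpenShift AI, a framework like Ollama is added as a custom model-serving runtime. A minimal sketch of what such a ServingRuntime resource might look like is shown below, assuming KServe's `serving.kserve.io/v1alpha1` API; the image reference and model-format name are illustrative placeholders, not the exact manifest used in this course:

[source,yaml]
----
# Hypothetical sketch of a custom ServingRuntime for Ollama.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: ollama-runtime
spec:
  containers:
    - name: kserve-container
      image: ollama/ollama:latest   # placeholder image reference
      ports:
        - containerPort: 11434      # Ollama's default API port
          protocol: TCP
  multiModel: false
  supportedModelFormats:
    - name: ollama                  # illustrative format name
      autoSelect: true
----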

=== How does Ollama work?


At its core, Ollama simplifies the process of downloading, installing, and interacting with a wide range of LLMs, empowering users to explore their capabilities without the need for extensive technical expertise or reliance on cloud-based platforms.

In this course, we will focus on a single LLM, Mistral, run on the Ollama framework. However, with an understanding of the Ollama framework, we will be able to work with a variety of large language models using the exact same configuration.

You will be able to switch models in minutes, all running on the same platform. This will enable you to test, compare, and evaluate multiple models with the skills gained in the course.
12 changes: 6 additions & 6 deletions modules/chapter2/pages/index.adoc
@@ -11,19 +11,15 @@
For information about OpenShift AI as self-managed software on your OpenShift cl

In this course we cover installation of *Red Hat OpenShift AI self-managed* using the OpenShift Web Console.

== General Information about Installation
== Applicable Operators


[INFO]
====
The product name has been recently changed to *Red{nbsp}Hat OpenShift AI (RHOAI)* (old name *Red{nbsp}Hat OpenShift Data Science*). In this course, most references to the product use the new name. However, references to some UI elements might still use the previous name.
====

In addition to the *Red{nbsp}Hat OpenShift AI* Operator there are some other operators that you may need to install depending on which features and components of *Red{nbsp}Hat OpenShift AI* you want to install and use.


https://www.redhat.com/en/technologies/cloud-computing/openshift/pipelines[Red{nbsp}Hat OpenShift Pipelines Operator]::
The *Red{nbsp}Hat OpenShift Pipelines Operator* is required if you want to install the *Red{nbsp}Hat OpenShift AI Pipelines* component.
In addition to the *Red{nbsp}Hat OpenShift AI* Operator, there are additional operators that you may need to install, depending on which features and components of *Red{nbsp}Hat OpenShift AI* you want to utilize.


[NOTE]
@@ -37,6 +33,10 @@
The *OpenShift Serverless Operator* is a prerequisite for the *Single Model Serving Platform*.
https://docs.openshift.com/container-platform/latest/hardware_enablement/psap-node-feature-discovery-operator.html[OpenShift Service Mesh Operator]::
The *OpenShift Service Mesh Operator* is a prerequisite for the *Single Model Serving Platform*.

https://www.redhat.com/en/technologies/cloud-computing/openshift/pipelines[Red{nbsp}Hat OpenShift Pipelines Operator]::
The *Red{nbsp}Hat OpenShift Pipelines Operator* is a prerequisite for the *Single Model Serving Platform*.
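
Although this course installs operators through the web console, each installation ultimately creates an Operator Lifecycle Manager (OLM) Subscription behind the scenes. For background, a Subscription for the Serverless operator might look roughly like this (a hedged sketch; the channel and namespace can vary by cluster and product version):

[source,yaml]
----
# Illustrative OLM Subscription; values may vary by cluster/version.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: serverless-operator
  namespace: openshift-serverless
spec:
  channel: stable                   # update channel
  name: serverless-operator         # package name in the catalog
  source: redhat-operators          # catalog source
  sourceNamespace: openshift-marketplace
----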



[NOTE]
====
2 changes: 2 additions & 0 deletions modules/chapter2/pages/section1.adoc
@@ -6,6 +6,8 @@

IMPORTANT: The installation requires a user with the _cluster-admin_ role

This exercise uses the Red Hat Demo Platform, specifically the OpenShift Container Platform Cluster resource. If you haven't already, you need to launch the lab environment before continuing.

. Log in to Red Hat OpenShift as a user who has the _cluster-admin_ role assigned.

. Navigate to **Operators** -> **OperatorHub** and search for each of the following operators individually. Click the button or tile for each. In the pop-up window that opens, ensure you select the latest version in the *stable* channel and click **Install** to open the operator's installation view. For this lab you can skip the installation of the optional operators.
