diff --git a/modules/ROOT/pages/index.adoc b/modules/ROOT/pages/index.adoc
index 7596b3a..ed734c0 100644
--- a/modules/ROOT/pages/index.adoc
+++ b/modules/ROOT/pages/index.adoc
@@ -6,16 +6,16 @@ video::intro_v4.mp4[width=640]
 
 Welcome to this quick course on _Serving an LLM using OpenShift AI_.
 
-This program was designed to guide you through the process of installing an OpenShift AI Platform using an OpenShift Container Platform Web Console UI. We get hands-on experience in each component needed to enable a RHOAI Platform using an Openshift Container Platform Cluster.
+This program was designed to guide you through the process of installing an OpenShift AI Platform using the OpenShift Container Platform Web Console UI. We will get hands-on experience with each component needed to enable a RHOAI Platform on an OpenShift Container Platform cluster.
 
-Once we have an operational OpenShift AI Platform. We will login and begin the process of configuration of: Model Runtimes, Data Science Projects, Data connections, & finally use a jupyter notebook to infer the answers to easy questions.
+Once we have an operational OpenShift AI Platform, we will log in and begin the configuration of Model Runtimes, Data Science Projects, and Data connections, and finally use a Jupyter notebook to infer the answers to easy questions.
 
-There will be some challenges along the way, all designed to teach us about a component, or give us the knowledge utilizing OpenShift AI and hosting an Large Language Model.
+There will be some challenges along the way, all designed to teach us about a component or give us the knowledge needed to utilize OpenShift AI and host a Large Language Model.
 
-If you're ready, let’s get started !
+If you're ready, let’s get started!
 
-IMPORTANT: The hands-on labs in this course were created and tested with RHOAI v2.9.1 & later. Labs will work without any changes in minor dot release upgrades of the product. Please open issues in this repository if you face any issue.
+IMPORTANT: The hands-on labs in this course were created and tested with RHOAI v2.9.1 & later versions. Labs will work without changes across minor dot-release upgrades of the product. Please open an issue in this repository if you face any problems.
 
 == Authors
 
@@ -37,7 +37,7 @@ The PTL team acknowledges the valuable contributions of the following Red Hat as
 We will use the https://demo.redhat.com/catalog?item=babylon-catalog-prod%2Fopenshift-cnv.ocpmulti-wksp-cnv.prod[*Red Hat OpenShift Container Platform Cluster*] catalog item in the Red Hat Demo Platform (RHDP) to run the hands-on exercises in this course.
 
 [TIP]
-If you are planning on starting this course now, go ahead & launch the workshop now. It takes <10 minutes to provision, which is just enough time to finish the introduction section.
+If you are planning on starting this course now, go ahead & launch the workshop. It takes less than 10 minutes to provision, which is just enough time to finish the introduction section.
 video::openshiftai_demo.mp4[width=640]
 
@@ -75,6 +75,6 @@ The overall objectives of this course include:
 
 * Load an LLM model into the Ollama runtime framework
- * Import (from git repositories), interact with LLM model via a Jupyter Notebooks
+ * Import (from git repositories) and interact with an LLM model via Jupyter Notebooks
 * Experiment with the Mistral LLM
\ No newline at end of file
diff --git a/modules/chapter1/pages/index.adoc b/modules/chapter1/pages/index.adoc
index 50d16f8..a2d7cc0 100644
--- a/modules/chapter1/pages/index.adoc
+++ b/modules/chapter1/pages/index.adoc
@@ -4,14 +4,14 @@
 [NOTE]
 This unit of the journey provides context around the technologies we encounter & a few analogies to facilitate understanding the purpose of guided lab in the next section. Feel free to skip ahead if you just want to get started.
 
-=== Why this technical course ?
+== Why this technical course?
 
-A Formula One Driver doesn't need to know the how to build an engine to be an F1 champion. However, she/he needs to have a *mechanical sympathy* which is understanding of car's mechanics to drive it effectively and get the best out it.
+A Formula One driver doesn't need to know how to build an engine to be an F1 champion. However, she/he needs to have *mechanical sympathy*: an understanding of the car's mechanics to drive it effectively and get the best out of it.
 
-The same applies to AI, we don't need to be AI experts to harness the power of large language models but we have to develop a certain level of *"technological awareness"* with how LLM Models are Trained, Selected, Operationalized, Delivered, Infered from, Fined-Tuned, Augmented and kept up-to-date. Not just as users, but as aficionados who understand the underlying components to effectively communicate with clients, partners, and co-workers.
+The same applies to AI. We don't need to be AI experts to harness the power of large language models, but we do need to develop a certain level of *"technological awareness"* about how LLM models are trained, selected, operationalized, delivered, inferred from, fine-tuned, augmented, and kept up to date. Not just as users, but as aficionados who understand the underlying components well enough to communicate effectively with clients, partners, and co-workers.
 
 The true power lies in the platform that enables us to harness a diverse range of AI models, tools, infrastructure and operationalize our ML projects.
 
-That platform, *OpenShift AI* is what we learn to create, configure, and utilize to Serve LLM Models in this quick course.
+That platform, *OpenShift AI*, is what we learn to create, configure, and utilize to serve LLM models in this quick course.
diff --git a/modules/chapter1/pages/section1.adoc b/modules/chapter1/pages/section1.adoc
index 5c0b5a5..1e9ce77 100644
--- a/modules/chapter1/pages/section1.adoc
+++ b/modules/chapter1/pages/section1.adoc
@@ -4,23 +4,23 @@
 OpenShift builds upon Kubernetes by providing an enhanced platform with additional capabilities. It simplifies the deployment and management of Kubernetes clusters while adding enterprise features, developer tools, and security enhancements.
 
-In addition, Openshift provides a Graphic User Interface for Kubernetes. Openshift AI runs on Openshift, therefore, the engine under the hood of both products is Kubernetes.
+In addition, OpenShift provides a graphical user interface for Kubernetes. OpenShift AI runs on OpenShift; therefore, the engine under the hood of both products is Kubernetes.
 
 Most workloads are deployed in kubernetes via YAML files.
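+
+For illustration only, here is a minimal sketch of such a YAML file: a Deployment manifest in which every name, label, image, and port is a placeholder value, not a resource used in this course:
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: example-app                # placeholder name
+spec:
+  replicas: 1                      # desired number of pod copies
+  selector:
+    matchLabels:
+      app: example-app             # must match the pod template labels below
+  template:
+    metadata:
+      labels:
+        app: example-app
+    spec:
+      containers:
+      - name: example-app
+        image: quay.io/example/example-app:latest   # placeholder image
+        ports:
+        - containerPort: 8080      # port the container listens on
+```
+
+A file like this, applied with a command such as `oc apply -f deployment.yaml`, asks the cluster to create and keep one pod running from the placeholder image.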
 A Kubernetes Deployment YAML file is a configuration file written in YAML (YAML Ain't Markup Language) that defines the desired state of a Kubernetes Deployment. These YAML file are used to create, update, or delete Deployments in Kubernetes / OpenShift clusters.
 
-Don’t worry about needing to know how to write these files. That's what OpenShift & OpenShift AI will take care of for us. In this course, we will just need to select the options we want in the UI. OpenShift + OpenShift AI will take care of creating the YAML deployment files.
+Don’t worry about needing to know how to write these files; that's what OpenShift & OpenShift AI take care of for us. In this course, we just need to select the options we want in the UI, and OpenShift and OpenShift AI will create the YAML deployment files for us.
 
-We will have to perform a few YAML file copy and paste operations, instructions are provided in the course.
+We will have to perform a few YAML file copy-and-paste operations; instructions are provided in the course.
 
-Just know, YAML files create resources in the Kubernetes platform directly. We primarily use the OpenShift AI UI to perform these tasks to deliver our LLM.
+Just know that YAML files create resources directly in the Kubernetes platform. We primarily use the OpenShift AI UI to perform these tasks to deliver our LLM.
 
 == Large Language Models
 
-LLMs - Large Language Models (LLMs) can generate new stories, summarize texts, and even perform advanced tasks like reasoning and problem solving, which is not only impressive but also remarkable due to their accessibility and easy integration into applications.
+Large Language Models (LLMs) can generate new stories, summarize texts, and even perform advanced tasks like reasoning and problem solving, which is impressive and made all the more remarkable by their accessibility and easy integration into applications.
 
-As you probably already heard, training large language models is expensive, time consuming, and most importantly requires a vast amount of data fed into the Model.
+As you probably already know, training large language models is expensive, time-consuming, and, most importantly, requires a vast amount of data to be fed into the model.
 
-The common outcome from this training is a Foundation model: this is an LLM designed to generate and understand human-like text across a wide range of use cases.
+The common outcome of this training is a Foundation model: an LLM designed to generate and understand human-like text across a wide range of use cases.
 
 The key to this powerful language processing architecture, *is the Transformer!* A helpful definition of a *Transformer* is a set of neural networks that consist of an encoder and a decoder with self-attention capabilities. The Transformer was created by Google and started as a language translation algorithm. It analyzes relationships between words in text, which crucial for LLMs to understand and generate language.
 
@@ -47,11 +47,11 @@ There are hundreds of popular LLMs, nonetheless, their operation remains the sam
 Ollama is not an LLM Model - Ollama is a relatively new but powerful open-source framework designed for serving machine learning models. It's designed to be efficient, scalable, and easy to use; making it an attractive option for developers and organizations looking to deploy their AI models into production.
 
-==== How does Ollama work?
+=== How does Ollama work?
 At its core, Ollama simplifies the process of downloading, installing, and interacting with a wide range of LLMs, empowering users to explore their capabilities without the need for extensive technical expertise or reliance on cloud-based platforms.
 
 In this course, we will focus on single LLM, Mistral, run on the Ollama Framework. However, with the understanding of the Ollama Framework, we will be able to work with a variety of large language models utilizing the exact same configuration.
 
-You will be able to switch models in minutes, all running on the same platform. This will enable you test, compare, and evalute multiple models with the skills gained in the course.
\ No newline at end of file
+You will be able to switch models in minutes, all running on the same platform. This will enable you to test, compare, and evaluate multiple models with the skills gained in the course.
\ No newline at end of file
diff --git a/modules/chapter2/pages/section1.adoc b/modules/chapter2/pages/section1.adoc
index 783b071..96cf8de 100644
--- a/modules/chapter2/pages/section1.adoc
+++ b/modules/chapter2/pages/section1.adoc
@@ -15,6 +15,7 @@ This exercise uses the Red Hat Demo Platform; specifically the OpenShift Contain
 
 . Navigate to **Operators** -> **OperatorHub** and search for each of the following Operators individually. Click on the button or tile for each. In the pop up window that opens, ensure you select the latest version in the *stable* channel and click on **Install** to open the operator's installation view. For this lab you can skip the installation of the optional operators. [*] You do not have to wait for the previous Operator to complete before installing the next. For this lab you can skip the installation of the optional operators as there is no GPU.
+// Should this be a note?
 
 * Web Terminal
diff --git a/modules/chapter2/pages/section2.adoc b/modules/chapter2/pages/section2.adoc
index fa24ffc..ca94712 100644
--- a/modules/chapter2/pages/section2.adoc
+++ b/modules/chapter2/pages/section2.adoc
@@ -101,7 +101,7 @@ Single Model Serve Platform will now be deployed to expose ingress connections w
 Congratulations, you have successfully completed the installation of OpenShift AI on an OpenShift Container Cluster. OpenShift AI is now running on a new Dashboard!
 
- * We Installed the required OpenShift AI Operators
+ * We installed the required OpenShift AI Operators
 ** Serverless, ServiceMesh, & Pipelines Operators
 ** OpenShift AI Operator
 ** Web Terminal Operator
diff --git a/modules/chapter3/pages/index.adoc b/modules/chapter3/pages/index.adoc
index e638062..b242e60 100644
--- a/modules/chapter3/pages/index.adoc
+++ b/modules/chapter3/pages/index.adoc
@@ -1,10 +1,10 @@
 = OpenShift AI Configuration
 
-This chapter begins with running & configured OpenShift AI environment, if you don't already have your environment running, head over to Chapter 2.
+This chapter begins with a running and configured OpenShift AI environment. If you don't already have your environment running, head over to Chapter 2.
 
 There's a lot to cover in section 1, we add the Ollama custom Runtime, create a data science project, setup storage, create a workbench, and finally serve the Ollama Framework, utilizing the Single Model Serving Platform to deliver our model to our Notebook Application.
 
-In section 2 we will explore using the Jupyter Notebook from our workbench to infere data from the Mistral 7B LLM. While less technical than previous section of this hands-on course, there are some steps to download the Mistral Model, update our notebook with inference endpoint, and evaluate our Models performance.
+In section 2, we will explore using the Jupyter Notebook from our workbench to infer data from the Mistral 7B LLM. While less technical than the previous section of this hands-on course, there are still some steps: downloading the Mistral model, updating our notebook with the inference endpoint, and evaluating our model's performance.
 
 Let's get started!
\ No newline at end of file
diff --git a/modules/chapter3/pages/section1.adoc b/modules/chapter3/pages/section1.adoc
index 0b505e8..a7107d4 100644
--- a/modules/chapter3/pages/section1.adoc
+++ b/modules/chapter3/pages/section1.adoc
@@ -4,13 +4,13 @@ video::openshiftai_setup_part1.mp4[width=640]
 
 == Model Serving Runtimes
 
-A model-serving runtime provides integration with a specified model server and the model frameworks that it supports. By default, Red Hat OpenShift AI includes the following Model Run Times:
+A model-serving runtime provides integration with a specified model server and the model frameworks that it supports. By default, Red Hat OpenShift AI includes the following model-serving runtimes:
 
 * OpenVINO Model Server runtime.
 * Caikit TGIS for KServe
 * TGIS Standalone for KServe
 
-However, if these runtime do not meet your needs (it they don't support a particular model framework, for example), you might want to add your own custom runtimes.
+However, if these runtimes do not meet your needs (if they don't support a particular model framework, for example), you might want to add your own custom runtimes.
 
 As an administrator, you can use the OpenShift AI interface to add and enable custom model-serving runtimes. You can then choose from your enabled runtimes when you create a new model server.
 
@@ -275,7 +275,7 @@ spec:
     insecureEdgeTerminationPolicy: Redirect
 ```
 
-*This should finish in a few seconds. Now it's time to deploy our storage buckets.* 
+*This should finish in a few seconds. Now it's time to deploy our storage buckets.*
 
 video::openshiftai_setup_part2.mp4[width=640]
 
@@ -289,7 +289,7 @@ From the OCP Dashboard:
 
 . For the first step, select the UI route and paste it in a browser Window.
 
- . This window opens the MinIO Dashboard. Log in with user/password combination you set, or the default listed in yaml file above.
+ . This window opens the MinIO Dashboard. Log in with the username/password combination you set, or the default listed in the YAML file above.
 
 Once logged into the MinIO Console:
 
@@ -367,7 +367,7 @@ image::deploy_model_2.png[width=800]
 
 *Create the model server with the following values:*
---
+
 .. Model name: `Ollama-Mistral`
 .. Serving Runtime: `Ollama`
 .. Model framework: `Any`
 
@@ -378,6 +378,6 @@ image::deploy_model_2.png[width=800]
 
 After clicking the **Deploy** button at the bottom of the form, the model is added to our **Models & Model Server list**. When the model is available, the inference endpoint will populate & the status will indicate a green checkmark.
 
-We are now ready to interact with our newly deployed LLM Model. Join me in Section 2 to explore Mistral running on OpenShift AI using Jupyter Notebooks.
+We are now ready to interact with our newly deployed LLM model. Join me in Section 2 to explore Mistral running on OpenShift AI using Jupyter Notebooks.
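+
+As a preview of what the Section 2 notebook automates, here is a minimal sketch of querying the served model over REST. This is an illustration only: it assumes Ollama's documented `/api/generate` interface, the endpoint URL is a placeholder for the inference endpoint shown in the Models list, and the model must first be pulled into Ollama (covered in Section 2):
+
+```python
+import requests
+
+# Placeholder: replace with the inference endpoint from the Models & Model Servers list
+ENDPOINT = "https://your-endpoint"
+
+payload = {
+    "model": "mistral",    # model to query; it must already be pulled into Ollama
+    "prompt": "Describe Paris in 100 words or less.",
+    "stream": False,       # ask for one JSON reply instead of a token stream
+}
+
+# POST the prompt to Ollama's generate endpoint and print the model's reply
+response = requests.post(f"{ENDPOINT}/api/generate", json=payload, timeout=120)
+response.raise_for_status()
+print(response.json()["response"])
+```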
diff --git a/modules/chapter3/pages/section2.adoc b/modules/chapter3/pages/section2.adoc
index 587479a..524c530 100644
--- a/modules/chapter3/pages/section2.adoc
+++ b/modules/chapter3/pages/section2.adoc
@@ -27,6 +27,7 @@ Explore the notebook, and then continue.
 
 === Update the Inference Endpoint
 
 Head back to the RHOAI workbench dashboard & copy the interence endpoint from our ollama-mistral model.
+// Should it be inference instead of interence?
 
 Return the Jupyter Notebook Environment:
 
@@ -43,7 +44,7 @@ image::serverurl.png[width=800]
 
 . In the fourth cell, place our first call to the Ollama-Mistral Framework Served by OpenShift AI.
 
 [WARNING]
-Before we continue, we need to perform the following additional step. As mentioned, The Ollama Model Runtime we launched in OpenShift AI is a Framework that can host multiple LLM Models. It is currently running but is waiting for the command to instruct it to download Model to Serve. The following command needs to run from the OpenShift Dashboard. We are going to use the web_terminal operator to perform this next step. 
+Before we continue, we need to perform the following additional step. As mentioned, the Ollama Model Runtime we launched in OpenShift AI is a framework that can host multiple LLM models. It is currently running but is waiting for a command instructing it which model to download and serve. The following command needs to be run from the OpenShift Dashboard. We are going to use the web_terminal operator to perform this next step.
 
 == Activating the Mistral Model in Ollama
 
@@ -65,15 +66,15 @@ curl https://your-endpoint/api/pull \
 
 . Click on the Start button in the terminal window, wait for the bash..$ prompt to appear
 . Past the modified code block into the window and press enter.
 
-The message: *status: pulling manifest* should appear. This begins the model downloading process. 
+The message: *status: pulling manifest* should appear. This begins the model downloading process.
 
 image::curl_command.png[width=800]
 
-Once the download completes, the *status: success:* message appears. We can now return to the Jupyter Notebook Tab in the browser and proceed. 
+Once the download completes, the *status: success:* message appears. We can now return to the Jupyter Notebook tab in the browser and proceed.
 
 === Create the Prompt
 
-This cell sets the *system message* portion of the query to our model. Normally, we don't get the see this part of the query. This message details how the model should act, respond, and consider our questions. It adds checks to valdiate the information is best as possible, and to explain answers in detail. 
+This cell sets the *system message* portion of the query to our model. Normally, we don't get to see this part of the query. This message details how the model should act, respond, and consider our questions. It adds checks to validate the information as well as possible, and to explain answers in detail.
 
 == Memory for the conversation
 
@@ -87,7 +88,7 @@ The Notebooks first input to our model askes it to describe Paris in 100 words o
 
 In green text is the window, there is the setup message that is sent along with the single sentence question to desctibe to the model how to consider and respond to the question.
 
-It takes approximately 12 seconds for the model to respond with the first word of the reply, and the final word is printed to the screen approximately 30 seconds after the request was started.
+It takes approximately 12 seconds for the model to respond with the first word of the reply, and the final word is printed to the screen approximately 30 seconds after the request was started.
 
 image::paris.png[width=800]
 
@@ -95,11 +96,11 @@ The responce answered the question in a well-considered and informated paragraph
 
 === Second Input
 
-Notice that the Second input - "Is there a River" - does not specify where the location is that might have a River. Because the conversation history is passed with the second input, there is not need to specify any additional informaiton. 
+Notice that the second input - "Is there a River" - does not specify the location that might have a river. Because the conversation history is passed with the second input, there is no need to specify any additional information.
 
 image::london.png[width=800]
 
-The total time to first word took approximately 14 seconds this time, just a bit longer due the orginal information being sent. The time for the entire reponse to be printed to the screen just took over 4 seoncds. 
+The time to first word was approximately 14 seconds this time, just a bit longer due to the previous conversation history being sent along with the question. The entire response was printed to the screen in just over 4 seconds.
 
 Overall our Model is performing well without a GPU and in a container limited to 4 cpus & 10Gb of memory.
 
@@ -115,7 +116,7 @@ Add a few new cells to the Notebook.
 
 image::experiment.png[width=800]
 
-Experiment with clearing the memory statement, then asking the river question again. Or perhaps copy one of the input statements and add your own question for the model. 
+Experiment with clearing the memory statement, then asking the river question again. Or perhaps copy one of the input statements and add your own question for the model.
 
 Try not clearing the memory and asking a few questions.
 
@@ -124,7 +125,7 @@
 
 == Delete the Environment
 
-Once you finished experimenting with questions, make sure you head back to the Red Hat Demo Platform and delete the Openshift Container Platform Cluster.
+Once you have finished experimenting with questions, make sure you head back to the Red Hat Demo Platform and delete the OpenShift Container Platform cluster.
 
 You don't have to remove any of the resources; deleting the environment will remove any resources created during this lesson.
 
@@ -133,7 +134,7 @@ If you enjoyed this walkthrough, please send the team a note.
 
 If you have suggestions to make it better or clarify a point, please send the team a note.
 
-Until the next time, Keep being Awesome!
+Until next time, keep being awesome!