From 58f5e11b1f3ec58ec9e29c58a09458ace643f0da Mon Sep 17 00:00:00 2001
From: Rutuja Deshmukh
Date: Mon, 10 Jun 2024 15:02:55 +0530
Subject: [PATCH] Content edits suggested for this training

---
 modules/ROOT/pages/index.adoc        |  6 +--
 modules/chapter1/pages/index.adoc    | 23 +++++-----
 modules/chapter2/pages/index.adoc    |  8 ++--
 modules/chapter2/pages/section1.adoc |  4 +-
 modules/chapter2/pages/section2.adoc | 14 +++---
 modules/chapter3/pages/index.adoc    |  6 +--
 modules/chapter3/pages/section1.adoc | 68 ++++++++++++++--------------
 modules/chapter3/pages/section2.adoc | 46 +++++++++----------
 package-lock.json                    |  2 +-
 9 files changed, 90 insertions(+), 87 deletions(-)

diff --git a/modules/ROOT/pages/index.adoc b/modules/ROOT/pages/index.adoc
index 4f28605..a37e899 100644
--- a/modules/ROOT/pages/index.adoc
+++ b/modules/ROOT/pages/index.adoc
@@ -1,7 +1,7 @@
 = Serving LLM Models on OpenShift AI
 :navtitle: Home
 
-Welcome to this Quick course on _Deploying an LLM using OpenShift AI_. This is the first of a set of advanced courses about Red Hat OpenShift AI:
+Welcome to this Quick course on _Deploying an LLM using OpenShift AI_. This is the first of a set of advanced courses about Red Hat OpenShift AI.
 
 IMPORTANT: The hands-on labs in this course were created and tested with RHOAI v2.9.1. Labs should mostly work without any changes in minor dot release upgrades of the product. Please open issues in this repository if you face any issue.
@@ -30,7 +30,7 @@ When ordering this catalog item in RHDP:
  * Enter Learning RHOAI in the Salesforce ID field
 
- * Scroll to the bottom, check the box to confirm acceptance of terms and conditions
+ * Scroll to the bottom, and check the box to confirm acceptance of terms and conditions
 
  * Click order
@@ -48,7 +48,7 @@ The overall objectives of this introductory course include:
  * Familiarize utilizing Red Hat OpenShift AI to Serve & Interact with an LLM.
 
- * Installing Red Hat OpenShift AI Operator & Dependencies
+ * Install Red Hat OpenShift AI Operator & Dependencies
 
  * Add a custom Model Serving Runtime
diff --git a/modules/chapter1/pages/index.adoc b/modules/chapter1/pages/index.adoc
index fdf86d9..9a7ab28 100644
--- a/modules/chapter1/pages/index.adoc
+++ b/modules/chapter1/pages/index.adoc
@@ -6,37 +6,38 @@ This segment of the course provides context to know & analogies to guide us to c
 === Why this technical course ?
 
-Previously, read a post on LinkenIn and felt it summed up the why quite nicely.
+Previously, I read a post on LinkedIn and felt it summed up the
+'why' quite nicely.
 
-It described the basic idea that a Formula One Driver doesn't need to know the how to build an engine to be an F1 champion. However, she/he needs to have a *mechanical sympathy* which is understanding of car's mechanics to drive it effectively and get the best out it.
+It described the basic idea that a Formula One driver doesn't need to know how to build an engine to be an F1 champion. However, they need to have *mechanical sympathy*: an understanding of the car's mechanics to drive it effectively and get the best out of it.
 
-The same applies to AI, we don't need to be AI experts to harness the power of large language models but we to develop a certain level of "mechanical sympathy" with how these Models are Selected, Operationized, Served, Infered from, and kept up to date, to work with AI in harmony. Not just as users, but as collaborators who understand the underlying mechanics to communicate with clients, partners, and co-workers effectively.
+The same applies to AI: we don't need to be AI experts to harness the power of large language models, but we do need to develop a certain level of "mechanical sympathy" with how these models are selected, operationalized, served, inferred from, and kept up to date, to work with AI in harmony. Not just as users, but as collaborators who understand the underlying mechanics to communicate with clients, partners, and co-workers effectively.
 
-It's not just about the Model itself, it's about the platform that empowers us to create trushtworthy AI applications and guides us in making informed choices.
+It's not just about the Model itself; it's about the platform that empowers us to create trustworthy AI applications and guides us in making informed choices.
 
 The true power lies in the platform that enables us to harness a diverse range of AI models, tools, infrastructure and operationalize our ML projects.
 
-That platform, *OpenShift AI* is what we learn to create, configure, and utilize to Serve LLM Models in this quick course.
+That platform, *OpenShift AI*, is what we learn to create, configure, and utilize to serve LLM Models in this quick course.
 
 === The Ollama Model Framework
 
-LLMs - Large Language Models (LLMs) can generate new stories, summarize texts, and even performing advanced tasks like reasoning and problem solving, which is not only impressive but also remarkable due to their accessibility and easy integration into applications.
+Large Language Models (LLMs) can generate new stories, summarize texts, and even perform advanced tasks like reasoning and problem-solving, which is not only impressive but also remarkable due to their accessibility and easy integration into applications.
 
-There are a lot of popular LLMs, Nonetheless, their operation remains the same: users provide instructions or tasks in natural language, and the LLM generates a response based on what the model "thinks" could be the continuation of the prompt.
+There are a lot of popular LLMs. Nonetheless, their operation remains the same: users provide instructions or tasks in natural language, and the LLM generates a response based on what the model "thinks" could be the continuation of the prompt.
 
-Ollama is not an LLM Model - Ollama is a relatively new but powerful open-source framework designed for serving machine learning models. It's designed to be efficient, scalable, and easy to use, making it an attractive option for developers and organizations looking to deploy their AI models into production.
+Ollama is not an LLM model. Ollama is a relatively new but powerful open-source framework designed for serving machine learning models. It's designed to be efficient, scalable, and easy to use, making it an attractive option for developers and organizations looking to deploy their AI models into production.
 
 ==== How does Ollama work?
 
 *At its core, Ollama simplifies the process of downloading, installing, and interacting with a wide range of LLMs, empowering users to explore their capabilities without the need for extensive technical expertise or reliance on cloud-based platforms.
 
-In this course, we will focus on single LLM, Mistral. However, with the understanding of the Ollama Framework, we will be able to work with a variety of large language models utilizing the exact same configuration.
+In this course, we will focus on a single LLM, Mistral. However, with an understanding of the Ollama Framework, we will be able to work with a variety of large language models using the exact same configuration.
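To make the "same configuration, different model" idea concrete, the sketch below is illustrative only and not part of the course materials; it assumes the Python `requests` library and a placeholder endpoint URL:

```python
# Hedged sketch: Ollama exposes one REST API regardless of which LLM it serves,
# so switching models is just a matter of changing the "model" field.
# The endpoint URL below is a placeholder, not a value from the lab.
import requests

OLLAMA_URL = "https://your-ollama-endpoint.example.com"  # hypothetical inference endpoint

def ask(model: str, prompt: str) -> str:
    # Ollama's generate API; "stream": False returns a single JSON document
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Same call, different models:
# ask("mistral", "Summarize what a model-serving runtime does.")
# ask("llama2", "Summarize what a model-serving runtime does.")
```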
-You be able to switch models in minutes, all running on the same platform. This will enable you test, compare, and evalute multiple models with the skills gained in the course.
+You will be able to switch models in minutes, all running on the same platform. This will enable you to test, compare, and evaluate multiple models with the skills gained in the course.
 
 *Experimentation and Learning*
 
-Ollama provides a powerful platform for experimentation and learning, allowing users to explore the capabilities and limitations of different LLMs, understand their strengths and weaknesses, and develop skills in prompt engineering and LLM interaction. This hands-on approach fosters a deeper understanding of AI technology and empowers users to push the boundaries of what’s possible.*
+Ollama provides a powerful platform for experimentation and learning, allowing users to explore the capabilities and limitations of different LLMs, understand their strengths and weaknesses, and develop skills in prompt engineering and LLM interaction. This hands-on approach fosters a deeper understanding of AI technology and empowers users to push the boundaries of what’s possible.
diff --git a/modules/chapter2/pages/index.adoc b/modules/chapter2/pages/index.adoc
index 3837e68..c1833ef 100644
--- a/modules/chapter2/pages/index.adoc
+++ b/modules/chapter2/pages/index.adoc
@@ -4,10 +4,10 @@ OpenShift AI is supported in two configurations:
 * A managed cloud service add-on for *Red Hat OpenShift Dedicated* (with a Customer Cloud Subscription for AWS or GCP) or for Red Hat OpenShift Service on Amazon Web Services (ROSA).
-For information about OpenShift AI on a Red Hat managed environment, see https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_cloud_service/1[Product Documentation for Red Hat OpenShift AI Cloud Service 1]
+For information about OpenShift AI on a Red Hat managed environment, see https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_cloud_service/1[Product Documentation for Red Hat OpenShift AI Cloud Service 1].
 
 * Self-managed software that you can install on-premise or on the public cloud in a self-managed environment, such as *OpenShift Container Platform*.
-For information about OpenShift AI as self-managed software on your OpenShift cluster in a connected or a disconnected environment, see https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.8[Product Documentation for Red Hat OpenShift AI Self-Managed 2.8]
+For information about OpenShift AI as self-managed software on your OpenShift cluster in a connected or a disconnected environment, see https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.8[Product Documentation for Red Hat OpenShift AI Self-Managed 2.8].
 
 In this course we cover installation of *Red Hat OpenShift AI self-managed* using the OpenShift Web Console.
@@ -32,15 +32,17 @@ To support the KServe component, which is used by the single-model serving platf
 ====
 
 https://docs.openshift.com/container-platform/latest/hardware_enablement/psap-node-feature-discovery-operator.html[OpenShift Serveless Operator]::
+// Is this the correct link for the OpenShift Serverless Operator?
 The *OpenShift Serveless Operator* is a prerequisite for the *Single Model Serving Platform*.
 
 https://docs.openshift.com/container-platform/latest/hardware_enablement/psap-node-feature-discovery-operator.html[OpenShift Service Mesh Operator]::
+// Is this the correct link for the OpenShift Service Mesh Operator?
 The *OpenShift Service Mesh Operator* is a prerequisite for the *Single Model Serving Platform*.
 
 [NOTE]
 ====
-The following Operators are required to support the use of Nvidia GPUs (accelerators) with OpenShift AI
+The following Operators are required to support the use of Nvidia GPUs (accelerators) with OpenShift AI:
 ====
 
 https://docs.openshift.com/container-platform/latest/hardware_enablement/psap-node-feature-discovery-operator.html[Node Feature Discovery Operator]::
diff --git a/modules/chapter2/pages/section1.adoc b/modules/chapter2/pages/section1.adoc
index 612c790..dc1fc79 100644
--- a/modules/chapter2/pages/section1.adoc
+++ b/modules/chapter2/pages/section1.adoc
@@ -8,7 +8,7 @@ IMPORTANT: The installation requires a user with the _cluster-admin_ role
 . Login to the Red Hat OpenShift using a user which has the _cluster-admin_ role assigned.
 
-. Navigate to **Operators** -> **OperatorHub** and search for each of the following Operators individually. Click on the button or tile for each. In the pop up window that opens, ensure you select the latest version in the *stable* channel and click on **Install** to open the operator's installation view. For this lab you can skip the installation of the optional operators
+. Navigate to **Operators** -> **OperatorHub** and search for each of the following Operators individually. Click on the button or tile for each. In the pop up window that opens, ensure you select the latest version in the *stable* channel and click on **Install** to open the operator's installation view. For this lab you can skip the installation of the optional operators.
 
 [*] You do not have to wait for the previous Operator to complete before installing the next. For this lab you can skip the installation of the optional operators as there is no GPU.
@@ -40,4 +40,4 @@ IMPORTANT: The installation requires a user with the _cluster-admin_ role
 . The operator Installation progress window will pop up. The installation may take a couple of minutes.
 
-WARNING: Do proceed with the installation past this point. In order to access the LLM remotely; There will be some modifcations to the Data Science Cluster YAML file prior to completing the installation of Red Hat OpenShift AI.
\ No newline at end of file
+WARNING: Do not proceed with the installation past this point. In order to access the LLM remotely, you will need to make some modifications to the Data Science Cluster YAML file prior to completing the installation of Red Hat OpenShift AI.
\ No newline at end of file
diff --git a/modules/chapter2/pages/section2.adoc b/modules/chapter2/pages/section2.adoc
index 2bfabb8..3868f3f 100644
--- a/modules/chapter2/pages/section2.adoc
+++ b/modules/chapter2/pages/section2.adoc
@@ -4,16 +4,16 @@
 An SSL/TLS certificate is a digital object that allows systems to verify the identity & subsequently establish an encrypted network connection to another system using the Secure Sockets Layer/Transport Layer Security (SSL/TLS) protocol.
 
-By default, the Single Model Serving Platform in Openshift AI uses a self-signed certificate generated at installation for the endpoints that are created when deploying a Model server.
+By default, the Single Model Serving Platform in OpenShift AI uses a self-signed certificate generated during installation for the endpoints that are created when deploying a model server.
 
-This can be counter-intuitive because the OCP Cluster already has certificates configured which will be used by default for endpoints like Routes.
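For context on what the self-signed default means for clients, the sketch below is illustrative only and not part of the lab; it assumes a Python `requests` client, and the endpoint URL is a placeholder:

```python
# Hedged illustration: against a route backed by a self-signed certificate,
# TLS verification fails, and the usual workaround is to disable verification.
# That is exactly the error-ignoring the procedure below avoids.
import requests

endpoint = "https://model-endpoint.example.com/v1/models"  # placeholder URL

try:
    requests.get(endpoint, timeout=10)  # verification fails for a self-signed cert
except requests.exceptions.SSLError:
    requests.get(endpoint, timeout=10, verify=False)  # insecure workaround we want to avoid

# Once the endpoint is served with the cluster's own certificate, the plain
# requests.get(endpoint) call verifies successfully.
```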
+This can be counterintuitive because the OCP Cluster already has certificates configured which will be used by default for endpoints like Routes.
 
 This following procedure explains how to use the same certificate from the OpenShift Container cluster for OpenShift AI.
 
 == Use OpenShift Certificates for Ingress Routes
 
 [NOTE]
-Most customers will not use the self-signed certificates, opting instead to use certificates generated by their own authority. Therefore this step of adding secrets to OpenShift & OpenShift AI is common process during installation.
+Most customers will not use the self-signed certificates, opting instead to use certificates generated by their own authority. Therefore, this step of adding secrets to OpenShift & OpenShift AI is a common process during installation.
 
 === Navigate to the OpenShift Container Cluster Dashboard
@@ -26,7 +26,7 @@ The content of the Secret (data) should contain two items, *tls.cert* and *tls.k
 . Select the *openshift-ingress* project from the list.
 . Locate the file named *ingress-certs-(XX-XX-2024)*, type should be *Opaque*
 . Click on the filename to open the secret, Select the *YAML Tab*
- . Copy all the text from the window, insure you scroll down. (CTL-A should work).
+ . Copy all the text from the window, and ensure that you scroll down. (Ctrl+A should work.)
 
 *Clean & Deploy the Secret YAML Text:*
@@ -84,11 +84,11 @@ serving:
 Once you have made those changes to the YAML file, *Click Create* to Deploy the Data Science Cluster.
 
-Single Model Serve Platform will now be deployed / expose ingress connections with the same certificate as OpenShift Routes. Endpoints will be accessible using TLS without having to ignore error messages or create special configurations.
+The Single Model Serving Platform will now be deployed and will expose ingress connections with the same certificate as OpenShift Routes. Endpoints will be accessible using TLS without having to ignore error messages or create special configurations.
 
 == Epilogue
 
-Congradulations, you have successful completed the installation of OpenShift AI on an OpenShift Container Cluster. OpenShift AI is now running as new Dashboard!
+Congratulations, you have successfully completed the installation of OpenShift AI on an OpenShift Container Cluster. OpenShift AI is now running, with its own dashboard!
 
 * We Installed the required OpenShift AI Operators
@@ -98,4 +98,4 @@ Congradulations, you have successful completed the installation of OpenShift AI
 Additionally, we took this installation a step further by sharing TLS certificates from the OpenShift Cluster with OpenShift AI.
 
-We pick up working OpenShift AI UI in the next Chapter.
\ No newline at end of file
+We will pick up working with the OpenShift AI UI in the next Chapter.
\ No newline at end of file
diff --git a/modules/chapter3/pages/index.adoc b/modules/chapter3/pages/index.adoc
index 346aa27..e638062 100644
--- a/modules/chapter3/pages/index.adoc
+++ b/modules/chapter3/pages/index.adoc
@@ -2,9 +2,9 @@
 This chapter begins with running & configured OpenShift AI environment, if you don't already have your environment running, head over to Chapter 2.
 
-Lots to cover in section 1, we add the Ollama custom Runtime, Create a Data Science Project, Setup Storage, Create a Workbench, and finally serving the Ollama Framework, utilizing the Single Model Serving Platform to deliver our model to our Notebook Application.
+There's a lot to cover in section 1: we add the Ollama custom Runtime, create a data science project, set up storage, create a workbench, and finally serve the Ollama Framework, utilizing the Single Model Serving Platform to deliver our model to our Notebook Application.
 
-In section 2 we will explore using the Jupyter Notebook from our workbench, infere data from the Mistral 7B LLM. While less technical than previous section of this hands on course, there are some steps download the Mistral Model, updating our notebook with inference endpoint, and evaluating our Models performance.
+In section 2 we will explore using the Jupyter Notebook from our workbench to infer data from the Mistral 7B LLM. While less technical than the previous section of this hands-on course, there are some steps to download the Mistral Model, update our notebook with the inference endpoint, and evaluate our model's performance.
 
-Let's get started ---
\ No newline at end of file
+Let's get started!
\ No newline at end of file
diff --git a/modules/chapter3/pages/section1.adoc b/modules/chapter3/pages/section1.adoc
index 708596c..33bfd57 100644
--- a/modules/chapter3/pages/section1.adoc
+++ b/modules/chapter3/pages/section1.adoc
@@ -2,13 +2,13 @@
 == Model Serving Runtimes
 
-A model-serving runtime provides integration with a specified model server and the model frameworks that it supports. By default, Red Hat OpenShift AI includes the following Model RunTimes:
+A model-serving runtime provides integration with a specified model server and the model frameworks that it supports. By default, Red Hat OpenShift AI includes the following model-serving runtimes:
 
  * OpenVINO Model Server runtime.
  * Caikit TGIS for KServe
- * TGIS Standalong for KServe
+ * TGIS Standalone for KServe
 
-However, if these runtime do not meet your needs (it doesn’t support a particular model framework, for example), you might want to add your own custom runtimes.
+However, if these runtimes do not meet your needs (they don't support a particular model framework, for example), you might want to add your own custom runtimes.
 
 As an administrator, you can use the OpenShift AI interface to add and enable custom model-serving runtimes. You can then choose from your enabled runtimes when you create a new model server.
@@ -17,20 +17,20 @@ This exercise will guide you through the broad steps necessary to deploy a custo
 [NOTE]
 ====
-While RHOAI supports the ability to add your own runtime, it is up to you to configure, adjust and maintain your custom runtimes.
+While RHOAI supports the ability to add your own runtime, it is up to you to configure, adjust, and maintain your custom runtimes.
 ====
 
 == Add The Ollama Custom Runtime
 
 . Log in to RHOAI with a user who is part of the RHOAI admin group, for this lab we will be using the admin account.
 
-. In the RHOAI Console, Navigate to the Settings menu, then Serving Runtimes
+. In the RHOAI Console, navigate to the Settings menu, then select Serving Runtimes
 
 . Select the Add Serving Runtime button:
 
-. For the model serving platform runtime *Select: Single-Model Serving Platform.*
+. For the model serving platform runtime, *select: Single-Model Serving Platform.*
 
-. For API protocol this runtime supports *Select: REST*
+. For the API protocol this runtime supports, *select: REST*
 
 . Click on Start from scratch in the window that opens up, paste the following YAML:
 +
@@ -66,39 +66,39 @@ spec:
   name: any
 ```
-. After clicking the **Add** button at the bottom of the input area, we are see the new Ollama Runtime in the list. We can re-order the list as needed (the order chosen here is the order in which the users will see these choices)
+. After clicking the **Add** button at the bottom of the input area, you will see the new Ollama Runtime in the list. We can re-order the list as needed (the order chosen here is the order in which the users will see these choices).
 
 == Create a Data Science Project
 
 Navigate to & select the Data Science Projects section.
 
- . Select the create data science project button
+ . Select the create data science project button.
 
 . Enter a name for your project, such as *ollama-model*.
 
- . The resource name should be populated automatically
+ . The resource name should be populated automatically.
 
- . Optionally add a description to the data science project
+ . Optionally add a description to the data science project.
 
- . Select Create
+ . Select Create.
 
 == Deploy MinIO as S3 Compatible Storage
 
 === MinIO overview
 
-*MinIO* is a high-performance, S3 compatible object store. It can be deployed on a wide variety of platforms, and it comes in multiple flavors.
+*MinIO* is a high-performance, S3-compatible object store. It can be deployed on a wide variety of platforms, and it comes in multiple flavors.
 
 This segment describes a very quick way of deploying the community version of MinIO in order to quickly setup a fully standalone Object Store, in an OpenShift Cluster. This can then be used for various prototyping tasks that require Object Storage.
 
 [WARNING]
-This version of MinIO should not be used in production-grade environments. Also, MinIO is not included in RHOAI, and Red Hat does not provide support for MinIO.
+This version of MinIO should not be used in production-grade environments. Additionally, MinIO is not included in RHOAI, and Red Hat does not provide support for MinIO.
 
 === MinIO Deployment
 
 To Deploy MinIO, we will utilize the OpenShift Dashboard.
 
- . Click on the Project Selection list dropdown, Select the Ollama-Model project or the data science project you created in the previous step.
+ . Click on the Project Selection list dropdown and select the Ollama-Model project or the data science project you created in the previous step.
 
 . Then Select the + (plus) icon from the top right of the dashboard.
@@ -280,9 +280,9 @@ From the OCP Dashboard:
 . This will display two routes, one for the UI & another for the API.
 
- . For the first step select the UI route, and paste it in a browser Window.
+ . For the first step, select the UI route and paste it in a browser window.
 
- . This window opens the MinIO Dashboard, login with user/password combination you set, or the default listed in yaml file above.
+ . This window opens the MinIO Dashboard. Log in with the user/password combination you set, or the default listed in the YAML file above.
 
 Once logged into the MinIO Console:
@@ -295,19 +295,19 @@ Once logged into the MinIO Console:
 .. *storage*
 
 [NOTE]
- When serving a LLM or other model Openshift AI looks within a Folder, therefore we need at least one subdirectory under the Models Folder.
+ When serving an LLM or other model, OpenShift AI looks within a folder. Therefore, we need at least one subdirectory under the models folder.
 
- . Via the Navigation menu, *Select object browser*, Click on the Model Bucket.
- . From the models bucket page, click add path, and type *ollama* as the name of the sub-Folder or path.
+ . Via the Navigation menu, *select object browser*, then click on the Model Bucket.
+ . From the models bucket page, click add path, and type *ollama* as the name of the sub-folder or path.
 [IMPORTANT]
-In most cases to serve a model, the trained model would be uploaded into this sub-directory. *Ollama is a special case, as it can download and manage Several LLM models as part of the runtime.*
+In most cases, to serve a model, the trained model would be uploaded into this sub-directory. *However, Ollama is a special case, as it can download and manage several LLM models as part of the runtime.*
 
 . We still need a file available in this folder for the model deployment workflow to succeed.
 
- . So we will copy an emptyfile.txt file to the ollama subdirectory. You can download the file from https://github.com/rh-aiservices-bu/llm-on-openshift/tree/main/serving-runtimes/ollama_runtime[*this location*]. Or you can create your own file called emptyfile.txt and upload it.
+ . So we will copy an *emptyfile.txt* file to the ollama subdirectory. You can download the file from https://github.com/rh-aiservices-bu/llm-on-openshift/tree/main/serving-runtimes/ollama_runtime[*this location*]. Alternatively, you can create your own file called emptyfile.txt and upload it.
 
- . Once you have this file ready, upload it into the Ollama path in the model bucket, by clicking the upload button and selecting the file from your local desktop.
+ . Once you have this file ready, upload it into the Ollama path in the model bucket by clicking the upload button and selecting the file from your local desktop.
 
 === Create Data Connection
 
 Navigate to the Data Science Project section of the OpenShift AI Console /Dashbo
 . Select the Data Connection menu, followed by create data connection
 . Provide the following values:
 .. Name: *models*
-.. Access Key: is the minio_root-user from YAML file
-.. Secret Key: is the minio_root_password from the YAML File
-.. Endpoint: is the Minio API URL from the Routes page in Openshift Dashboard
-.. Region: Is required for AWS storage & cannot be blank (no-region-minio)
-.. Bucket: is the Minio Storage bucket name: *models*
+.. Access Key: use the minio_root_user value from the YAML file
+.. Secret Key: use the minio_root_password value from the YAML file
+.. Endpoint: use the MinIO API URL from the Routes page in the OpenShift Dashboard
+.. Region: This is required for AWS storage & cannot be blank (no-region-minio)
+.. Bucket: use the MinIO storage bucket name: *models*
 
-Repeat for the Storage bucket, using *storage* for the name & bucket.
+Repeat the same process for the Storage bucket, using *storage* for the name & bucket.
 
 == Creating a WorkBench
 
 Navigate to the Data Science Project section of the OpenShift AI Console /Dashboard. Select the Ollama-model project.
 
- . Select the WorkBench button, then create workbench
+ . Select the WorkBench button, then click create workbench
 
 .. Name: `ollama-model`
 
 .. Notebook Image: `Minimal Python`
 
- .. Leave the remianing options default
+ .. Leave the remaining options at their defaults.
 
- .. Optionally, scroll to the bottom, check the `Use data connection box`
+ .. Optionally, scroll to the bottom, check the `Use data connection` box.
 
 .. Select *storage* from the dropdown to attach the storage bucket to the workbench.
@@ -349,7 +349,7 @@ Depending on the notebook image selected, it can take between 2-20 minutes for t
 == Creating The Model Server
 
-From the ollama-model WorkBench Dashboard in the ollama-model project, specify the **Models** section, and select Deploy Model from the **Single Model Serving Platform Button**.
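Before moving on to the model server, a brief aside on the data connection values above: they are ordinary S3 client settings. The sketch below is illustrative only and not part of the course; it assumes the `boto3` library and placeholder credentials, and shows a scripted equivalent of uploading the placeholder file into the ollama path of the models bucket:

```python
# Hedged sketch: the data connection fields map directly to standard S3 client
# settings. Endpoint and credentials below are placeholders for the values from
# your MinIO deployment.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://minio-api-route.example.com",   # Endpoint: MinIO API URL
    aws_access_key_id="minio_root_user_value",            # Access Key
    aws_secret_access_key="minio_root_password_value",    # Secret Key
    region_name="no-region-minio",                        # Region (required, any non-empty value)
)

# Equivalent of uploading emptyfile.txt into the ollama/ path of the models bucket:
s3.put_object(Bucket="models", Key="ollama/emptyfile.txt", Body=b"")
```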
+From the ollama-model WorkBench Dashboard in the ollama-model project, navigate to the **Models** section, and select Deploy Model from the **Single Model Serving Platform** button.
 
 *Create the model server with the following values:*
@@ -363,7 +363,7 @@
 .. Model location path: `/ollama`
 
-After clicking the **Deploy** button at the bottom of the form, the model is added to our **Models & Model Server list**. When the model is avialable the inference endpoint will populate & the status will indicate a green checkmark.
+After clicking the **Deploy** button at the bottom of the form, the model is added to our **Models & Model Server list**. When the model is available, the inference endpoint will populate & the status will indicate a green checkmark.
 
 We are now ready to interact with our newly deployed LLM Model. Join me in Section 2 to explore Mistral running on OpenShift AI using Jupyter Notebooks.
diff --git a/modules/chapter3/pages/section2.adoc b/modules/chapter3/pages/section2.adoc
index c2603fb..4c2d8dd 100644
--- a/modules/chapter3/pages/section2.adoc
+++ b/modules/chapter3/pages/section2.adoc
@@ -2,14 +2,14 @@
 == Open the Jupyter Notebook
 
-From the OpenShift AI ollama-model workbench dashboard,
+From the OpenShift AI ollama-model workbench dashboard:
 
-* Select the Open link to the right of the status section; When the new window opens, use the OpenShift admin user & password to login to the Notebook.
+* Select the Open link to the right of the status section. When the new window opens, use the OpenShift admin user & password to log in to the Notebook.
 
-click *Allow selected permissions* button to complete login to the notebook.
+* Click the *Allow selected permissions* button to complete the login to the notebook.
 
 [NOTE]
-If the *OPEN* link for the notebook is grayed out, the notebook container is still starting, this process can take a few minutes & up to 20+ minutes depending on the notebook image we opt'd to choose.
+If the *OPEN* link for the notebook is grayed out, the notebook container is still starting. This process can take a few minutes & up to 20+ minutes depending on the notebook image we chose.
 
 == Inside the Jupyter Notebook
@@ -20,26 +20,26 @@ Navigate to the llm-on-openshift/examples/notebooks/langchain folder:
 Then open the file: _Langchain-Ollama-Prompt-memory.ipynb_
 
-Explore the notebook then continue.
+Explore the notebook, and then continue.
 
 === Update the Inference Endpoint
 
 Head back to the RHOAI workbench dashboard & copy the interence endpoint from our ollama-mistral model.
 
-Return the Jupyter Notebook Environment,
+Return to the Jupyter Notebook environment:
 
 . Paste the inference endpoint into the Cell labeled interfence_server_url = *"replace with your own inference address"*
 
- . We can now start executing the code in the cells, starting with the set the inference server url cell.
+ . We can now start executing the code in the cells, starting with the cell that sets the inference server URL.
 
 . Next we run the second cell: !pip install -q langchain==0.1.14 ; there is a notice to update pip, just ignore and continue.
 
 . The third cell imports the langchain components that provide the libraries and programming files to interact with our LLM model.
 
- . The fourth cell, place our first call to the Ollama-Mistral Framework Served by OpenShift AI.
+ . In the fourth cell, we place our first call to the Ollama-Mistral Framework served by OpenShift AI.
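Conceptually, those first cells build a langchain client against the Ollama endpoint. The sketch below is an illustration only, not the notebook's exact code; the `langchain_community` import path and the placeholder URL are assumptions:

```python
# Hedged sketch of what the first cells amount to; the inference URL below is a
# placeholder for the endpoint copied from the RHOAI Models page.
from langchain_community.llms import Ollama
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

inference_server_url = "https://your-inference-endpoint.example.com"  # replace with your own endpoint

llm = Ollama(
    base_url=inference_server_url,                 # the Ollama runtime served by OpenShift AI
    model="mistral",                               # the model we instruct Ollama to download next
    callbacks=[StreamingStdOutCallbackHandler()],  # print tokens as they stream back
)

# First call to the served model (what the fourth cell does conceptually);
# it will only succeed after the model has been pulled, as described next:
# llm.invoke("Describe Paris in 100 words or less.")
```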
 [WARNING]
-Before we continue we need to perform the following additional step. As mentioned, The Ollama Model Runtime we launched in OpenShift AI is a Framework that can host multiple LLM Models. It is currently running but is waiting for the command to instruct it to download Model to Serve. The following command needs to run from the OpenShift Dashboard. We are going to use the web_terminal operator to perform this next step.
+Before we continue, we need to perform the following additional step. As mentioned, the Ollama runtime we launched in OpenShift AI is a framework that can host multiple LLM models. It is currently running, but it is waiting for a command instructing it to download a model to serve. The following command needs to be run from the OpenShift Dashboard. We are going to use the web_terminal operator to perform this next step.
 
 == Activating the Mistral Model in Ollama
@@ -65,11 +65,11 @@ Once the download completes, the *status: success:* message appears. We can now
 === Create the Prompt
 
-This cell sets the *system message* portion of the query to our model. Normally we don't get the see this part of the query. This message details how the model should act / respond / and consider our questions. This adds checks to valdiate the information is best as possible, and to explain answers in detail.
+This cell sets the *system message* portion of the query to our model. Normally, we don't get to see this part of the query. This message details how the model should act, respond, and consider our questions. It adds checks to validate that the information is as accurate as possible, and to explain answers in detail.
 
 == Memory for the conversation
 
-Keeps track of the conversation, this way history of the chat are also sent along with new chat information keeping the context for future questions.
+This cell keeps track of the conversation. This way, the chat history is sent along with each new input, keeping the context for future questions.
 
 The next cell tracks the conversation and prints it to the Notebook output window so we can experience the full conversation list.
@@ -77,31 +77,31 @@
 == First Input
 
 The Notebooks first input to our model askes it to describe Paris in 100 words or less.
 
-In green text is the window is the setup message that is sent along with the single sentence question to desctibe to the model how to consider and respond to the question.
+In green text in the window is the setup message that is sent along with the single-sentence question, describing to the model how to consider and respond to the question.
 
-It takes ~12 seconds for the model to respong with the first word of the reply, and the final word is printed to the screen ~30 seconds after the request was started.
+It takes approximately 12 seconds for the model to respond with the first word of the reply, and the final word is printed to the screen approximately 30 seconds after the request was started.
 
-The responce answered the question in a well considered and informated paragraph that less than 100 words in length
+The response answered the question in a well-considered and informative paragraph that is less than 100 words in length.
 
 === Second Input
 
-Notice that the Second input - Is there a River, does not specify where the location is that might have a River. because the conversation history is passed with the second input, there is not need to specify any additional informaiton.
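What makes this follow-up work is the conversation memory described above: the buffered history travels with each new input. The sketch below is an illustration, assuming langchain's standard `ConversationBufferMemory` and `ConversationChain` classes rather than the notebook's exact cells, with a placeholder endpoint URL:

```python
# Hedged sketch (not the notebook's exact code): how buffered chat history lets
# a follow-up like "Is there a river?" be understood without restating the location.
from langchain_community.llms import Ollama
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = Ollama(base_url="https://your-inference-endpoint.example.com", model="mistral")  # placeholder URL

memory = ConversationBufferMemory()          # stores the prior turns verbatim
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

conversation.predict(input="Describe Paris in 100 words or less.")
# The buffered history is sent with the next input, so no location is needed:
conversation.predict(input="Is there a river?")
```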
+Notice that the second input - "Is there a River?" - does not specify where the location with a river might be. Because the conversation history is passed along with the second input, there is no need to specify any additional information.
 
-The total time to first word took ~14 seconds this time, just a bit longr due the orginal information being sent. The time for the entire reponse to be printed to the screen just took over 4 seoncds.
+The total time to the first word was approximately 14 seconds this time, just a bit longer due to the original information being sent along. The entire response was printed to the screen in just over 4 seconds.
 
 Overall our Model is performing well without a GPU and in a container limited to 4 cpus & 10Gb of memory.
 
 == Second Example Prompt
 
-Similar to the previous example, except we use the City of London, and run a cell to remove the verbose text reguarding what is sent or recieved apart from the another from model.
+Similar to the previous example, except we use the City of London, and run a cell to remove the verbose text regarding what is sent or received, apart from the answer from the model.
 
-There is no change to memory setting, but go ahead and evalute where the second input; is there a river is answer correctly.
+There is no change to the memory setting, but go ahead and evaluate whether the second input, "Is there a river?", is answered correctly.
 
 == Experimentation with Model
 
-Add a few new cells to the Notebooks
+Add a few new cells to the Notebook.
 
-Experiment with clearing the memory statement, then asking the river quetsion again. Or perhaps copy one of the input statements and add your own question for the model.
+Experiment with clearing the memory statement, then asking the river question again. Or perhaps copy one of the input statements and add your own question for the model.
 
 Try not clearing the memory and asking a few questions.
@@ -110,13 +110,13 @@ You have successfully deployed a Large Language Model, now test the information
 == Delete the Environment
 
-Once you finished experimenting with questions, make you head back to the Red Hat Demo Platform and delete the Openshift Container Platform Cluster.
+Once you have finished experimenting with questions, make sure you head back to the Red Hat Demo Platform and delete the OpenShift Container Platform Cluster.
 
-You don't have to remove any of the resources, deleting the environment will remove any resources created during this lesson.
+You don't have to remove any of the resources; deleting the environment will remove any resources created during this lesson.
 
 === Leave Feedback
 
-If you enjoyed this walkthrough, please sent the team a note.
+If you enjoyed this walkthrough, please send the team a note.
 
 If you have suggestions to make it better or clarify a point, please send the team a note.
 
 Until the next time, Keep being Awesome!
diff --git a/package-lock.json b/package-lock.json
index 0d43cf4..16a6be2 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -1,5 +1,5 @@
 {
-  "name": "llm-model-serving",
+  "name": "llm-on-rhoai",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {