-
Notifications
You must be signed in to change notification settings - Fork 132
Conversation
@@ -1,11 +1,13 @@ | |||
FROM ubuntu:xenial as openedx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are currently using Ubuntu 20.04 which is the focal release. This was outdated and causing issues hence updated this.
Dockerfile
Outdated
apt-get upgrade -qy && \ | ||
apt install -y git-core language-pack-en build-essential python3.8-dev python3.8-distutils libmysqlclient-dev && \ | ||
apt install -y git-core language-pack-en build-essential python3.8-dev python3-virtualenv \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more dependencies that were being installed in Ansible configurations too and missing here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..our edx-platform image uses venv, Tutor uses pyenv, and our cookiecutter template just uses the system Python installation. I think we still need to hash out a standard here.
Dockerfile
Outdated
@@ -14,20 +16,116 @@ RUN locale-gen en_US.UTF-8 | |||
ENV LANG en_US.UTF-8 | |||
ENV LANGUAGE en_US:en | |||
ENV LC_ALL en_US.UTF-8 | |||
ENV ANALYTICS_DASHBOARD_CFG /edx/etc/insights.yml | |||
ENV LMS_HOST http://localhost:18000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more environment variables. This is just a part of configurations and other variables required by the service are in edx/etc/insights.yml
file which is loaded by analytics_dashboard->settings->yaml_config.py
file through ANALYTICS_DASHBOARD_CFG
environment variable.
Database related settings are overridden from the docker-compose
file of devstack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't think we want all these explicitly in the Dockerfile (especially things like localhost references). These feel like they should live in the YAML file described in https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0045-arch-ops-and-config.html#configuration if there's no reasonable universal default, and in the default Django settings otherwise.
WORKDIR /edx/app/analytics_dashboard | ||
COPY requirements /edx/app/analytics_dashboard/requirements | ||
RUN python3.8 -m pip install -r requirements/production.txt | ||
ARG COMMON_APP_DIR="/edx/app" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These variables are just defined here so they can be reused throughout the Dockerfile
. They have been defined as ARG
instead of ENV
because ARG
has its lifecycle bound only to Dockerfile whereas ENV
variables live throughout the life of Docker container in the form of environment variables
As these variables have no usage as environment variables it made more sense to define them as ARG
with default values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this context should be added as a comment in the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was way before we had an initial review round with Kyle. Meanwhile this is still one of the motivation/reasons to use ARG instead of ENV here, but the actual and important point behind using ARGS is the compatibility of Path structure with Tutor and other OpenedX installations as they can be provided during Image building process.
Dockerfile
Outdated
RUN npm install -g npm@${INSIGHTS_NPM_VERSION} | ||
|
||
# Tried to cache the dependencies by copying related files after the npm install step but npm post install fails in that case. | ||
# COPY package.json ${INSIGHTS_CODE_DIR}/package.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to cache the dependencies by copying code related files after the npm install
step but npm post install fails in that case so had to copy all the files before installing dependencies
Dockerfile
Outdated
ENV ANALYTICS_DASHBOARD_CFG /edx/etc/insights.yml | ||
ENV LMS_HOST http://localhost:18000 | ||
ENV HOME /root | ||
ENV PATH "/edx/app/insights/venvs/insights/bin:/edx/app/insights/nodeenvs/insights/bin:$PATH" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added python venv bin
and nodeenv bin
to path so that we don't have to access the whole path again and again and activate the env to install dependencies and run packages from there.
RUN virtualenv -p python3.8 --always-copy ${SUPERVISOR_VENV_DIR} | ||
COPY requirements ${INSIGHTS_CODE_DIR}/requirements | ||
|
||
ENV PATH="${INSIGHTS_CODE_DIR}/node_modules/.bin:$PATH" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added node_modules
to path so packages are available globally in our Docker image and we can run them directly instead of accessing the whole path again and again
configuration_files/cms.conf
Outdated
@@ -0,0 +1,11 @@ | |||
[program:cms] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the succeeding configuration files were being generated through Jinja templates in Ansible configuration. We decided not to use Jinja templates with Dockerfile and instead copy the content of those files as they aren't modified regularly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these cms.conf
and lms.conf
files here? It doesn't look the shell scripts they wrap exist in the container, and I certainly hope edx-platform isn't installed in it.
scripts/devstack.sh
Outdated
@@ -0,0 +1,26 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This script is used as entry point to the service from devstack and was previously hosted in configurations repo.
@kdmccormick we created this PR against openedx-unsupported/devstack#943, tagging you if you could review this PR |
Sorry @iamsobanjaved , I don't have the capacity to review any Devstack work. |
Jeremy gave me some additional context on this. I'll take a look soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done reviewing yet, but here are my questions and comments so far.
Dockerfile
Outdated
@@ -1,11 +1,13 @@ | |||
FROM ubuntu:xenial as openedx | |||
FROM ubuntu:focal as production |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the alias be "app" instead of "production"? That seems to be the most common choice in other service repos: https://github.com/search?q=user%3Aopenedx+%22from+ubuntu%3Afocal%22&type=code . It's also used in our cookiecutter template: https://github.com/openedx/edx-cookiecutters/blob/master/cookiecutter-django-ida/%7B%7Bcookiecutter.repo_name%7D%7D/Dockerfile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I had named it keeping in view that we are going to have two images one for production and the other for dev environments and we are building the other image(dev) on the top of our base image which is production in this case. If we have consensus that we should call our base image as app
instead then I will go ahead and change that.
Dockerfile
Outdated
apt-get upgrade -qy && \ | ||
apt install -y git-core language-pack-en build-essential python3.8-dev python3.8-distutils libmysqlclient-dev && \ | ||
apt install -y git-core language-pack-en build-essential python3.8-dev python3-virtualenv \ | ||
python3.8-distutils libmysqlclient-dev libssl-dev openjdk-8-jdk gettext && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need openjdk-8-jdk here? I'm not sure about gettext either. Can we switch to the format from the edx-cookiecutters template that has comments explaining what each package is needed for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I listed these packages for installation keeping in view that they were already being installed in the Ansible configuration for insights. I will remove them if they are unnecessary but I will need input from someone who might have idea as to why they were added here in the first place.
Dockerfile
Outdated
apt-get upgrade -qy && \ | ||
apt install -y git-core language-pack-en build-essential python3.8-dev python3.8-distutils libmysqlclient-dev && \ | ||
apt install -y git-core language-pack-en build-essential python3.8-dev python3-virtualenv \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..our edx-platform image uses venv, Tutor uses pyenv, and our cookiecutter template just uses the system Python installation. I think we still need to hash out a standard here.
Also, I've updated https://openedx.atlassian.net/wiki/spaces/COMM/pages/3483205633/Dockerfile+Best+Practices with a few notes, references, and open questions. Feel free to contribute there as issues come up and get resolved in this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple more notes, but not a full review. I'm hoping some of the people I've poked again to provide feedback can do such a full review, or at least help us get to a point where we're comfortable with it as a first attempt.
Dockerfile
Outdated
@@ -14,20 +16,116 @@ RUN locale-gen en_US.UTF-8 | |||
ENV LANG en_US.UTF-8 | |||
ENV LANGUAGE en_US:en | |||
ENV LC_ALL en_US.UTF-8 | |||
ENV ANALYTICS_DASHBOARD_CFG /edx/etc/insights.yml | |||
ENV LMS_HOST http://localhost:18000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't think we want all these explicitly in the Dockerfile (especially things like localhost references). These feel like they should live in the YAML file described in https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0045-arch-ops-and-config.html#configuration if there's no reasonable universal default, and in the default Django settings otherwise.
configuration_files/cms.conf
Outdated
@@ -0,0 +1,11 @@ | |||
[program:cms] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these cms.conf
and lms.conf
files here? It doesn't look the shell scripts they wrap exist in the container, and I certainly hope edx-platform isn't installed in it.
@aht007 @jmbowman a couple questions, which will affect how I review this:
|
|
@kdmccormick We are aiming that these images can be used as suitable base images for tutor as well. Also this will serve as a template to follow for other repos once this gets merged so we can apply any learnings we get from here to other tasks as well. You can also have a look at this Github issue for more context. |
In terms of Tutor compatibility, a couple issues jump out at me. the /edx/... path structureMost Tutor images have a structure like:
(They are not all 100% consistent, but that is the general scheme.) Tutor and its plugins count on this directory structure in certain ways. For example, in the LMS/CMS image, Tutor's YAML config files are mounted into the container at As far as I know, there is no Insights plugin for Tutor. So, the paths in this image are OK because there's nothing in Tutor for them to be incompatible with. However, if this Dockerfile is to be used as a template for other Tutor-compatible Dockerfiles, then I think it needs to match Tutor's path structure. patchabilityTutor's Dockerfiles, docker-compose files, etc, are all Jinja2 templates. Tutor exposes a custom Unless Tutor drops the patching system, I am not sure how it would be able to use these upstream Dockerfiles as base images. (This is something that hadn't occurred to me when we last talked Jeremy, sorry!) @regisb is on vacation until the end of August, but when he gets back I'm really curious what he'd think about this. |
I'm inclined to leave out the lms/cms.conf files and see if anything breaks, although I'm fine if you want to wait for owning squad review to do that; just flag it explicitly as a question for them. I'd like to end up with the simplest/smallest working image, which may involve dropping stuff that's only there for historical reasons; this seems like a good opportunity to attempt that cleanup, even if we ultimately have to add back a few things that turn out to still have valid but non-obvious reasons for being included. (Especially since we could then document why they're present.) On the paths - can we make the path prefix an argument to the Dockerfile? I suspect installations other than Tutor and edx.org have even different choices, and it would be nice to accommodate those differences if we can do so without adding major complexity to the Dockerfile. I suspect changing the entire root path of an installation would be challenging due to paths saved in database fields, etc., but we might be able to move or rename certain subdirectories if that gets us to better compatibility across the installations. On the patching - I guess it depends on how much value is being obtained from it. One option could be to have both the template and a default output in the source repos, so Tutor could still leverage that extension option. But it does add some complexity to the source template files that makes them harder to understand for people who aren't familiar with Jinja2 yet, and a little harder to reason about regardless. |
Reminder of this note regarding Kyle's prior art in revamping the edx-platform Dockerfile, in case any of the changes there are relevant here: openedx-unsupported/devstack#943 (comment) |
2186d96
to
380e973
Compare
@kdmccormick Can you please take a look at this again.
|
Dockerfile
Outdated
COPY . ${INSIGHTS_CODE_DIR}/ | ||
RUN npm set progress=false && npm ci | ||
|
||
COPY configuration_files/insights.yml /edx/etc/insights.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we're using relative path
with the copy
command...
Can we also use it in other places as well just for the consistency
?
like here
COPY /scripts/insights.sh ${INSIGHTS_APP_DIR}/insights.sh
COPY /configuration_files/insights.conf ${SUPERVISOR_AVAILABLE_DIR}/insights.conf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted.
Regarding this comment, I just realized that we would still need an empty |
Yup, agreed with all of this. That is exactly what I did in the edx-platform Dockerfile: https://github.com/openedx/edx-platform/blob/c1009b56f6a0b3f8acfab815e3fc55a6a1820612/Dockerfile#L189-L195. Based on my manual testing, it worked fine with Devstack! |
c2c3a71
to
af7ece5
Compare
username: ${{ secrets.DOCKERHUB_USERNAME }} | ||
password: ${{ secrets.DOCKERHUB_PASSWORD }} | ||
target: dev | ||
repository: openedx/insights-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for reviewers: Should we push to edxops
or openedx
? It made sense to me that for openedx repos the images should also be in openedx docker hub org.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question @aht007 . Let me check with my team today and see what they think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aht007 After discussing, we'd prefer if the images are pushed to edxops
, for now at least.
Our reasoning is that these images will be used first and foremost by 2U (for Devstack and edX.org), so it makes sense if 2U has full administrative access over the image repositories. In the future, if we find that there's a general use case for the images that the community would like to support (whether that be a basis for Tutor or as an alternative installation method), at that point it would make sense to host the images from the openedx
organization.
I know there are currently some images hosted the openedx
organization, but the community doesn't support them and I don't know of anyone who uses them, so it's a bit misleading to have them there. I think it is OK to stop pushing those images to openedx
and start pushing them to edxops
instead.
All that said, I don't want to slow down edX's containerization project, so let me know if that causes a lot of logistical issues, and we can figure something else out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the change looks good to me except for where to push the docker image to.
Please make that change and I'll approve.
username: ${{ secrets.DOCKERHUB_USERNAME }} | ||
password: ${{ secrets.DOCKERHUB_PASSWORD }} | ||
target: prod | ||
repository: openedx/insights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @kdmccormick so let's make the change to edxops
here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will be commenting out the code that pushes the prod docker image for now because the repository edxops/insights
has been used for devstack image for a long time and if someone from the community is using that repository then it would break their development environment. Prod images are also not a priority for now and I will discuss with Jeremy about them next. Nevertheless I will change the org to edxops
for both images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some weirdness trying to get this all to work locally, see comment in openedx-unsupported/devstack#976 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test using the docker changes in openedx-unsupported/devstack#976 worked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor suggestion, otherwise looks good to me.
WORKDIR /edx/app/analytics_dashboard | ||
COPY requirements /edx/app/analytics_dashboard/requirements | ||
RUN python3.8 -m pip install -r requirements/production.txt | ||
ARG COMMON_APP_DIR="/edx/app" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this context should be added as a comment in the file?
a180a09
to
716e866
Compare
username: ${{ secrets.DOCKERHUB_USERNAME }} | ||
password: ${{ secrets.DOCKERHUB_PASSWORD }} | ||
target: dev | ||
repository: edxops/insights-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, can this be in the openedx Dockerhub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ISSUE: #1328
This PR is part of effort aimed at removing Ansible based configurations and replacing them with
Dockerfile
. Currently Devstack Docker images are built usingAnsible based configurations
in theconfigurations
repository. Through this effort we will make sure that the Repo has its ownDockerfile
which has all the necessary configurations to setup small production and dev environments.Steps to run this Image with Devstack:
dev
i.e.docker build -t image-name-of-choice --target dev .
docker compose
file ofdevstack
and replace the existinginsights
image with the one that you built without changing any other configurations there.make dev.up.insights
in the terminal while in the devstack directory.insights
andanalyticsapi
services.