Add Workbench for Google Cloud Workstations image #564

Merged 87 commits on Aug 9, 2023

Conversation

ianpittwood
Collaborator

Adds the image definition for the initial release of Workbench for Google Cloud Workstations

CLAassistant commented Jun 22, 2023

CLA assistant check
All committers have signed the CLA.

@ianpittwood ianpittwood marked this pull request as ready for review June 28, 2023 22:40
@ianpittwood ianpittwood added the gcw label (Related to Workbench for Google Cloud Workstations) Jun 28, 2023
@ianpittwood ianpittwood self-assigned this Jun 29, 2023
Contributor

@bschwedler bschwedler left a comment

Most of my comments are geared toward attempting to reduce the size of the image layers where possible.

I may have missed some places where we can do this, especially if they were buried in a script.
Also, is there an R package cache we can remove after packages are installed like I suggested for pip? I don't know as much about managing packages there.

Note: I did not do a thorough review of the configuration files

Review threads (all outdated and resolved):
workbench-for-google-cloud-workstations/.env
.github/workflows/build-workbench-for-gcw.yaml
workbench-for-google-cloud-workstations/Dockerfile (4 threads)
workbench-for-google-cloud-workstations/README.md (2 threads)
@ianpittwood ianpittwood requested a review from bschwedler July 6, 2023 19:43
@ianpittwood
Collaborator Author

Setting as blocked by #583. The issue has been escalated to Google and we are awaiting their response.

@ianpittwood ianpittwood requested a review from bschwedler August 8, 2023 19:20
Contributor

@msarahan msarahan left a comment

I think this is fine. I left some comments about things to talk about, but I don't think any immediate change is needed.

RSW_NAME=rstudio-workbench
PYTHON_VERSION=3.10.12
PYTHON_VERSION_ALT=3.9.17
PYTHON_VERSION_JUPYTER=3.10.12
Contributor

This confused me, as I thought it was a third installation of Python. Why are we setting this explicitly? Why isn't Jupyter simply assumed to be installed at PYTHON_VERSION?

Collaborator Author

@ianpittwood ianpittwood Aug 9, 2023

This matches the pattern of the other Workbench products. I personally have no problem defaulting to one version or the other, but we should probably make it a global change for the repo. It would make more sense to default the Jupyter install to the primary or alt version. I feel like this only causes problems for users, though I doubt many people will build directly from this project.
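
A minimal sketch of the fallback discussed here, assuming the version variables are resolved in a shell step at build time. The helper name `resolve_jupyter_python` is hypothetical and not part of this repo; the version numbers are the ones from the `.env` file in this PR.

```shell
#!/bin/sh
# Hypothetical helper: print the Python version Jupyter should be installed into.
# $1 is the primary version; $2 is an optional Jupyter-specific override.
# When the override is unset or empty, the primary version is used.
resolve_jupyter_python() {
    primary="$1"
    override="${2:-}"
    printf '%s\n' "${override:-$primary}"
}

# No explicit Jupyter version: falls back to the primary version.
resolve_jupyter_python 3.10.12 ""
# An explicit override still wins.
resolve_jupyter_python 3.10.12 3.9.17
```

With a default like this, `PYTHON_VERSION_JUPYTER` would only need to be set when Jupyter should deliberately use a different interpreter, which may address the "third installation" confusion.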

@@ -0,0 +1,2 @@
CRAN=https://packagemanager.posit.co/cran/__linux__/focal/latest
Contributor

It seems like we should be leaving this as the real CRAN URL, but making RSPM take priority. This is not specifically a question about this PR, though, so perhaps it is best discussed elsewhere.

&& rm -rf /var/lib/apt/lists/*

### Install R versions ###
RUN curl -O https://cdn.rstudio.com/r/ubuntu-2004/pkgs/r-${R_VERSION}_1_amd64.deb \
Contributor

I wonder if it might be nicer to write a proper bash file here, and then use a loop for these duplicated lines. I'm ambivalent. It would be extra work. It would make the Dockerfile easier to read and update. It might make it easier to run/debug scripts outside of docker builds.

Up to you!
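
One way the loop suggested above might look, as a sketch only. The function names, the `DRY_RUN` switch, and the example version numbers are hypothetical; the URL pattern is copied from the Dockerfile line under review, and installing the downloaded package with `apt-get install ./file.deb` is an assumption about how the image installs it.

```shell
#!/bin/sh
# Hypothetical script (for example install_r_versions.sh); not part of this PR.

# Build the download URL for one R version, using the pattern from the Dockerfile.
r_deb_url() {
    printf 'https://cdn.rstudio.com/r/ubuntu-2004/pkgs/r-%s_1_amd64.deb\n' "$1"
}

# Install every R version passed as an argument.
install_r_versions() {
    for version in "$@"; do
        url="$(r_deb_url "${version}")"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print what would be downloaded instead of touching the system.
            echo "would install ${url}"
            continue
        fi
        deb="r-${version}_1_amd64.deb"
        # Download, install, and delete the package in one chain so a failure stops the loop.
        curl -fsSL -o "${deb}" "${url}" \
            && apt-get install -y "./${deb}" \
            && rm -f "${deb}" \
            || return 1
    done
}

# Placeholder versions, not the ones pinned by this image.
DRY_RUN=1 install_r_versions 4.2.3 4.1.3
```

The dry-run mode is what makes the script easy to run and debug outside a Docker build, which was one of the motivations mentioned.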

Contributor

Thanks Mike. I think this could be useful, and I agree with your later comment about making this a follow-up issue so this can be addressed across multiple images.

&& rm -f /usr/lib/rstudio-server/bin/license-manager

### Install Jupyter and extensions ###
RUN /opt/python/"${PYTHON_VERSION_JUPYTER}"/bin/python -m venv /opt/python/jupyter \
Contributor

A lot of this has me feeling like we should have centralized scripts for these steps. We must be duplicating a ton of it between this and the other workstations image. I don't think it's important to consolidate it all in this PR, but it is perhaps worth filing a tech debt ticket for.

workbench-for-google-cloud-workstations/TurboActivate.dat (review thread outdated and resolved)
LANGUAGE=en_US:en
LC_ALL=en_US.UTF-8
JobType: any
Environment: PATH=/opt/python/3.9.16/bin:$PATH
Contributor

Do we still need this? This caused some issues on the DataOps platform, especially because it was prepended to the path.

If we do, we need to update this to the appropriate patch version.
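
For illustration, a small sketch of why prepending matters. The thread does not describe the actual DataOps problem, so this only demonstrates general `PATH` lookup order: two throwaway directories each provide a fake `python`, and the directory names are made up for the demo.

```shell
#!/bin/sh
# Illustration only: create two temporary directories, each with a fake "python".
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/pinned/bin" "${demo_dir}/user/bin"
printf '#!/bin/sh\necho pinned\n' > "${demo_dir}/pinned/bin/python"
printf '#!/bin/sh\necho user\n' > "${demo_dir}/user/bin/python"
chmod +x "${demo_dir}/pinned/bin/python" "${demo_dir}/user/bin/python"

# Suppose the session has already put the user's interpreter first.
session_path="${demo_dir}/user/bin:${PATH}"

# Prepending, as the profile entry does: the pinned interpreter shadows the user's choice.
prepended="$(PATH="${demo_dir}/pinned/bin:${session_path}"; python)"
# Appending: the pinned interpreter is only a fallback.
appended="$(PATH="${session_path}:${demo_dir}/pinned/bin"; python)"

echo "prepended -> ${prepended}"
echo "appended  -> ${appended}"
rm -rf "${demo_dir}"
```

A prepended entry wins over whatever interpreter the session had selected, so a stale patch version in that entry would silently take over.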

Collaborator Author

I have no clue! I noticed that these files tend to get wildly out of date in every project that uses them. If we should keep it and update it, I'm happy to. If we can remove it, I would also be happy.

Labels
gcw Related to Workbench for Google Cloud Workstations
4 participants