Runfiles API: obtaining the symlink path within runfiles_dir #288

Open
jwnimmer-tri opened this issue Dec 12, 2024 · 3 comments
jwnimmer-tri commented Dec 12, 2024

Description of the problem / feature request:

This is a feature request for an API addition to the runfiles library. I would like to have a way to return a runfiles path that lives inside the $RUNFILES_DIR (where it will be a symlink to its actual home), rather than having the runfiles library return the symlink's destination.

Chasing the symlink inside the Runfiles library loses information. Details below.

Feature requests: what underlying problem are you trying to solve with this feature?

Consider this demo program: https://github.com/jwnimmer-tri/repro/tree/bazel-runfiles-paths/demo

The program uses two runfiles: mesh.obj, which is part of the source tree, and mesh.mtl, which is the output of a genrule. As is conventional for obj files, the text of the obj names a material library file by saying mtllib mesh.mtl, which refers to a file in the same directory. For any software to load that file, the two files (obj and mtl) must be in the same directory.
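As a minimal sketch of why the same-directory requirement matters, assume a loader that resolves the mtllib name against the directory containing the obj file. ResolveMtllib is a hypothetical helper written for illustration, not part of the runfiles library or of any particular obj loader:

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (illustration only): resolves the filename named by an
// `mtllib` statement relative to the directory that contains the .obj file,
// which is the assumed behavior of a typical obj loader.
std::filesystem::path ResolveMtllib(const std::filesystem::path& obj_path,
                                    const std::string& mtllib_name) {
  return obj_path.parent_path() / mtllib_name;
}
```

With this kind of resolution, an obj path that points into the source tree yields an mtl path in the source tree too, which is not where a genrule output lives.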

However, if I run the program I see this output (bazel 8.0.0, rules_cc 0.0.17):

$ bazel run //demo:program
The goal is for all of these paths to be in the same directory:

mesh_path: /home/jwnimmer/jwnimmer-tri/repro/demo/mesh.obj
mtl_path: /mnt/nobackup/cache/bazel/_bazel_jwnimmer/d99b0c65cbdeed7837d9eb064003636b/execroot/_main/bazel-out/k8-fastbuild/bin/demo/mesh.mtl

RUNFILES_DIR=/mnt/nobackup/cache/bazel/_bazel_jwnimmer/d99b0c65cbdeed7837d9eb064003636b/execroot/_main/bazel-out/k8-fastbuild/bin/demo/program.runfiles and contains these files:
  MANIFEST
  _main
  _main/demo
  _main/demo/mesh.mtl
  _main/demo/mesh.obj
  _main/demo/program
  _repo_mapping

When the program calls Rlocation, the result is a path to the actual file: for the obj we get the source tree, and for the generated file we get the output path. That's fine in many cases, but when we have groups of inter-related files where some are source files and others are build outputs, we often need to have them in a single logical directory even if they are stored elsewhere. Many file formats that weave together multiple files into a single entity require the sub-asset files to be in the same directory as the main file.

It turns out we already have a reasonable directory with logical filenames no matter where they physically came from -- the files under RUNFILES_DIR are laid out logically, as seen in the RUNFILES_DIR walk above.

I would like to be able to call Rlocation and get paths like this:

  • /mnt/nobackup/cache/bazel/_bazel_jwnimmer/d99b0c65cbdeed7837d9eb064003636b/execroot/_main/bazel-out/k8-fastbuild/bin/demo/program.runfiles/_main/demo/mesh.obj
  • /mnt/nobackup/cache/bazel/_bazel_jwnimmer/d99b0c65cbdeed7837d9eb064003636b/execroot/_main/bazel-out/k8-fastbuild/bin/demo/program.runfiles/_main/demo/mesh.mtl

Proposed solution

If the Runfiles class had some kind of flag or option to opt in to returning RUNFILES_DIR-relative paths, instead of the manifest paths, that would solve the problem. It could either be a constructor argument, or an optional argument to Rlocation.

Work-arounds

Prior to bzlmod, I could scrape the EnvVars for RUNFILES_DIR and then tack on the workspace name and resource path afterward, to build the shape of path I need. Now with bzlmod, we have the _repo_mapping rewriting happening, which I cannot realistically re-implement myself. My current work-around is to #define private public and then clear() the runfiles_map_ private member, in which case the Runfiles object always returns paths relative to the runfiles dir.
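For reference, the pre-bzlmod approach can be sketched roughly as follows. This is only an illustration under stated assumptions: RunfilesDirPath is a hypothetical helper, it reads RUNFILES_DIR directly from the process environment, and it does no _repo_mapping lookup, so it is not correct under bzlmod in general.

```cpp
#include <cstdlib>
#include <filesystem>
#include <optional>
#include <string>

// Hypothetical helper (illustration only): joins $RUNFILES_DIR, the workspace
// name, and the resource path. It performs no _repo_mapping translation, so
// it only matches the pre-bzlmod layout.
std::optional<std::filesystem::path> RunfilesDirPath(
    const std::string& workspace_name, const std::string& resource_path) {
  const char* runfiles_dir = std::getenv("RUNFILES_DIR");
  if (runfiles_dir == nullptr || *runfiles_dir == '\0') {
    return std::nullopt;  // No runfiles directory was advertised.
  }
  return std::filesystem::path(runfiles_dir) / workspace_name / resource_path;
}
```

Called as RunfilesDirPath("_main", "demo/mesh.obj"), this would produce a path of the same shape as the two desired paths listed earlier, provided RUNFILES_DIR is set.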

What operating system are you running Bazel on?

Ubuntu 22.04

What's the output of bazel info release?

release 8.0.0

What version of rules_cc do you use? Can you paste the workspace rule used to fetch rules_cc? What other relevant dependencies does your project have?

See example link above for the MODULE.bazel of the reproducer. Only rules_cc == 0.0.17.

What Bazel options do you use to trigger the issue? What C++ toolchain do you use?

See example link above for reproducer. Local toolchain.

Have you found anything relevant by searching the web?

The https://groups.google.com/g/bazel-discuss/c/DsVivJhU7Bw discussion is loosely relevant.

Any other information, logs, or outputs that you want to share?

N/A

@jwnimmer-tri
Author

If the team has guidance on what kind of API would be suitable, I'm happy to write the PR and tests. Also, if you have guidance on whether or how to coordinate with the runfiles libraries for other languages (Java, Python, etc.), let me know. For my part I only need rules_cc, but I could also push the same change to rules_python.

@jwnimmer-tri
Author

It looks like rules_python already has an API to create a Runfiles object that is always directory-based:

https://github.com/bazelbuild/rules_python/blob/66a8b5b595710bd107c31ad5d449593536effb76/python/runfiles/runfiles.py#L390-L391

That would also be a solution here. I'll work on a pull request to copy that same API into C++.

@fmeum
Contributor

fmeum commented Dec 22, 2024

It's worth keeping in mind that there can be situations in which the runfiles directory isn't available or isn't up-to-date (e.g. on Windows or on Unix with --noenable_runfiles). Forcing its use may thus end up resulting in incorrect and potentially non-hermetic behavior. But of course this is totally fine if your code isn't used as a dependency by other Bazel projects.

If a precise file layout in runfiles is important, I would recommend adding a rule that collects all relevant files in a declared output directory (ctx.actions.declare_directory) in the relevant layout. You can then look up one runfile and expect all the others to be in the correct relative location on all OSes.
