protobin_to_proto.py not working with newer protobuf versions #17

Open
iscle opened this issue Feb 14, 2021 · 3 comments

@iscle

iscle commented Feb 14, 2021

I tried to extract some .proto files from a binary (Spotify.exe, to be specific) and noticed that the script fails if the Python protobuf library is newer than 3.11.1.

However, there is a second issue. With version 3.11.1 the script works, but it extracts only a few files. No errors are thrown, but I know for sure that the extracted files are not all of the ones present in the binary.

When doing a hex inspection of the binary I can see other proto definitions that are not getting dumped.

I might be wrong, but maybe by fixing the script to work with newer versions, the previously described issue will also fix itself. (See Edit2)

Hopefully you can do something about it,
Thanks!

Edit:
This is what the script outputs when it does not work:

Checking for wire-encoded proto files..
/home/iscle/protobin_to_proto.py:190: RuntimeWarning: Unexpected end-group tag: Not all data was converted
  descriptor = pb2.FileDescriptorProto.FromString(slice)

Checking for GZIPPED proto files..
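
For context on the second pass in the output above, here is a minimal sketch of one way to look for gzipped blobs embedded in a binary. It is not taken from protobin_to_proto.py; the function name `gzip_members` and the 1 MiB output cap are assumptions made for illustration. It searches for the gzip magic bytes and tries to inflate from each hit, keeping only streams that decompress completely.

```python
import zlib
from typing import Iterator, Tuple

# gzip header: two magic bytes followed by the "deflate" method byte.
GZIP_MAGIC = b"\x1f\x8b\x08"


def gzip_members(data: bytes, max_output: int = 1 << 20) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, decompressed bytes) for each complete gzip stream found.

    Hypothetical helper: the name and the max_output cap are illustrative.
    """
    offset = data.find(GZIP_MAGIC)
    while offset != -1:
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        inflater = zlib.decompressobj(zlib.MAX_WBITS | 16)
        try:
            payload = inflater.decompress(data[offset:], max_output)
        except zlib.error:
            payload = b""  # the magic bytes were a coincidence
        # eof is only True when the whole stream, trailer included, was read.
        if payload and inflater.eof:
            yield offset, payload
        offset = data.find(GZIP_MAGIC, offset + 1)
```

Each payload found this way would still have to be parsed as a file descriptor to confirm that it is a proto definition at all.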

Edit2:
Upon further inspection, the .proto files do not show up in a hex inspection of the binary either (searching for login5.proto, for example), but they are being sent over the network (captured with Wireshark, packets analyzed with CyberChef). This might be some sort of hiding trick by Spotify rather than an issue with the extractor itself. When I first wrote the issue, I probably saw something else with the same name (the network URL).
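
To tell a URL that merely contains login5.proto apart from an embedded descriptor, the hex search can be narrowed with a small check. The sketch below is an illustration, not the extractor's code: it relies on the fact that a serialized FileDescriptorProto stores its file name in field 1, which is encoded as the tag byte 0x0A followed by a length and the name. The function names, and the limit to names shorter than 128 bytes (so the length fits in one byte), are assumptions.

```python
from typing import Iterator, List, Tuple


def find_all(data: bytes, needle: bytes) -> Iterator[int]:
    """Yield every offset at which needle occurs in data."""
    start = data.find(needle)
    while start != -1:
        yield start
        start = data.find(needle, start + 1)


def descriptor_name_candidates(data: bytes, suffix: bytes = b".proto") -> List[Tuple[int, str]]:
    """Return (offset, name) for names that look like field 1 of a descriptor.

    Hypothetical heuristic: a hit needs the tag byte 0x0A, then a one-byte
    length, then exactly that many printable bytes ending in the suffix.
    """
    hits = []
    for suffix_pos in find_all(data, suffix):
        name_end = suffix_pos + len(suffix)
        for length in range(len(suffix) + 1, 128):
            tag_pos = name_end - length - 2
            if tag_pos < 0:
                break
            if data[tag_pos] == 0x0A and data[tag_pos + 1] == length:
                name = data[tag_pos + 2:name_end]
                # Reject control bytes and spaces inside the supposed name.
                if all(0x20 < byte < 0x7F for byte in name):
                    hits.append((tag_pos, name.decode("ascii")))
                    break
    return hits
```

A name that appears in the binary but never shows up in this list is more likely a plain string, such as a URL, than the start of a descriptor.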

@Bert-Proesmans
Contributor

Yo! I glanced over the code, and piecing the functionality back together is going slowly.
Do you happen to have some guesses about the cause of this? Also, can you provide a hex-encoded dump of a relevant segment of the binary for testing? I yeeted all electron apps from my pc because of principles.

My wild guess is that the proto structure used for protocol buffer file definitions has been updated. Let's hope that isn't the case, because it would imply additional complexity during reversal.

@iscle
Author

iscle commented Feb 17, 2021

Hi Bert,

thanks for your reply!

I just edited the issue because I tried to find the missing protos again, and it turns out I was mistaken: what I saw was a URL. However, even though the proto name does not appear in the hex, the protos are still being sent over the network. I confirmed this with a Wireshark dump (see Edit2 in the issue).

I have uploaded two macOS binaries for you: "Spotify_complete", which dumps all necessary protos, and "Spotify_missing", which uses protos that don't get dumped. https://drive.google.com/drive/folders/1Qc13YAk4e1rVWLvkHxZn96PNJcOKT2_u?usp=sharing
login5.proto is one that gets used in both, but only dumped in "Spotify_complete".

Regarding the issue with the protobuf library version, I have absolutely no idea what it could be. Maybe a diff between 3.11.1 and 3.11.2 would reveal the culprit.
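
Before diffing releases, it may help to record which protobuf version each test environment actually loads, so that results from different machines can be compared. The following is a minimal sketch using only the standard library; the helper names are made up for illustration, and 3.11.1 is simply the last version reported as working in this issue.

```python
from importlib import metadata
from typing import Optional, Tuple

# Last version reported as working in this issue.
LAST_KNOWN_GOOD = (3, 11, 1)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.11.2' or '4.21.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_protobuf_version() -> Optional[str]:
    """Return the installed protobuf version, or None if it is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


def is_newer_than_known_good(version_text: str) -> bool:
    return parse_version(version_text) > LAST_KNOWN_GOOD


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed in this environment")
    else:
        print(found, "newer than 3.11.1:", is_newer_than_known_good(found))
```

Running this in each virtual environment next to the extractor gives a quick record of which versions reproduce the warning and which do not.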

@Bert-Proesmans
Contributor

On Windows with Python 3.8.0 x64, I get no errors with multiple versions of protobuf, including 3.11.1, 3.11.2, and 3.15.1. The two binaries do indeed produce different proto files. Login-related protos seem to be missing from the Spotify_missing file.

I see the Spotify_complete file producing various files matching *login5.v3.proto at the root of the proto output, without any actual contents. This might be a bug? Underneath the /spotify/login5/v3/ folder there are also files written (as expected, I assume).
Strangely enough, your Spotify_missing produces more proto files than the other one. I'm not sure what to make of it.
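
To make the empty-file observation easy to reproduce on both binaries, a small check over the output directory could look like the sketch below. The function name and its `root_only` flag are hypothetical, and the directory layout is assumed from the description above.

```python
from pathlib import Path
from typing import List


def empty_proto_files(output_dir: Path, root_only: bool = True) -> List[Path]:
    """List zero-byte .proto files in the extractor's output directory.

    With root_only=True only the top level is checked, which is where the
    empty *login5.v3.proto files were seen; otherwise subfolders are included.
    """
    pattern = "*.proto"
    candidates = output_dir.glob(pattern) if root_only else output_dir.rglob(pattern)
    return sorted(path for path in candidates if path.is_file() and path.stat().st_size == 0)
```

Comparing this list for the Spotify_complete and Spotify_missing outputs would show whether the empty files appear for both binaries or only for one.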

I've lost interest in maintaining this repository. A next clue might be discovered after compiling an example proto to a C binary with the latest proto compiler and inspecting what comes out. But this is a relatively long shot.
Since this is Spotify, it isn't uncommon for them to obfuscate their internals at various points in time; reverse engineers are always a step behind 🚀
