added armur ai page #1865
Conversation
Thanks a lot for the application, and sorry for the late reply here. Let me share this application also with @bhargavbh, who previously looked into ink! smart contract vulnerabilities. I have one question: you say all your A.I. models are open source, but is any of your work so far actually open source? I couldn't find a GitHub organization. My main concern is that the dataset you will create as part of the first milestone is only useful for you and cannot easily be reused by others. Another big concern is that your project is simply too early for our ecosystem, given that there will be very little data to train your models.
Thanks for the application. I concur with David's concerns. What was the approximate size of the dataset and of the manual annotations (average per vulnerability class) needed to create a robust (non-overfitting) model for other smart-contract languages? I feel there are many low-hanging fruits for deterministic semantic analysis before diving into ML-based approaches.
Thanks for the application.
When you say that you have "built a successful A.I powered auditing tool for solidity files", can you elaborate on how you define success? And can you share the training and test sets and some raw evaluation numbers (beyond "80% accurate")?
You're right, the A.I. models are open source but our work up until now isn't. This is because all the work we've done so far has been for Solidity smart contracts, and since we haven't raised a Pre-Seed round yet, we haven't open-sourced the work, as this is the IP edge we want to demonstrate to investors.
Here's the process we followed for Solidity smart contracts -
After these 4 steps, the A.I. models work really well and produce a non-overfitting model. The main benefit over a hard-coded semantic analysis tool is that we won't EVER have to write "extensions" or "plugins" for the existing code base. That's the benefit you get with an A.I.-driven approach: we simply need to retrain it in the future, rather than "maintaining" a codebase that needs to be extended.
Hi Semuelle,
I have answered all the questions above. I just wanted to highlight something: we were referred by Gavin Wood, so kindly check with him as well. This is the reason we applied for the grant. Our original plan was to become a parachain, but Robin Ejsmon-Frey and Nico Morgan suggested that a grant could be the right route.
Please let me know whether I should make changes to the originally submitted proposal or whether these comments are enough.
Thanks for the updates. Could you update the specification for the default deliverables? For example, there is no code delivered in M1, so I don't see how inline docs and Docker are relevant.
Looking at the other milestones and deliverables, I have to say they don't really fit the grants program.
- We don't pay for deployment, unless it's an infrastructure project. And since you are planning to make the models proprietary, I don't see how that would work.
- What do you mean by testing and deploying the model(s) if you only train them in the next step? I assume this is an API you are talking about? Please review all deliverable specifications, as they should be something verifiable and reusable. For example, if you want "model selection" to be a deliverable, the test procedure and results should also be a part of the deliverable.
- 16 person months for training and testing of models seems excessive.
Hey Semuelle, I think there's a lot of confusion.
Thanks,
> But when we work with Polkadot, all the work will be open-sourced, since in a way Polkadot will be our investor for this work and Polkadot is funding it. We will sign all your agreements that state we have to share our work with the entire community.
That's good to hear. Could you try to make this clearer as part of the milestone tables? The size of the current ink! smart contract dataset is also still a concern for me, so the application might be a little too early.
Hi @AkhilSharma90,
ad 1: I think we are using the term 'model' differently. To me, an open source model is a trained model whose parameters and algorithms are open source. For example, to open source my neural network, I would publish its weights and biases. When you say you have "trained open source models", I hear "we have published the models we have trained", but - correct me if I'm wrong - you are referring to the fact that the software you have used to train the models is open source. Your website refers to your "proprietary technology", so I assumed you were not planning to publish the trained model. Perhaps you can clarify this.
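To make the distinction concrete, here is a generic sketch (PyTorch assumed, with a made-up architecture; this is not the applicant's actual stack): publishing the script below is open-sourcing the training code, while publishing the file it writes at the end is open-sourcing the trained model.

```python
# Generic illustration only: a stand-in classifier that maps a code embedding
# to vulnerability-class scores. Architecture and file names are hypothetical.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(768, 128),  # code embedding in
    nn.ReLU(),
    nn.Linear(128, 8),    # one score per vulnerability class
)

# ... a training loop would run here ...

# Sharing this script alone = open-source training code.
# Sharing the learned parameters written below = open-source (trained) model.
torch.save(model.state_dict(), "vulnerability_classifier.pt")
```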
ad 2: deployment is the title of your second milestone and listed as one of your deliverables.
ad 3: I think paying for this kind of deployment would be fine if you published the results of the work, the models and ideally the test results. 'Deployment' is usually used for backends and smart contracts around here, which we don't fund.
ad 5: It's clear to me and everyone reading your proposal that you have already put a lot of work into your project and there are not many people who could and would pursue it. However, given the novelty of the concept, I would prefer to see some hard data on its accuracy, precision and market interest, so a more iterative approach would make more sense to me. Also, since you mention datasets: there is a milestone for creating datasets and it's listed with three person-months.
Lastly: by "we need help of the Polkadot community", do you mean this grant (application) or are you planning to involve the community in different ways?
Thanks for the application @AkhilSharma90. In addition to the above comments:
- Can you update to specify which license you will use? Currently, it still lists all four.
- The deliverables still contain mainly boilerplate text. Is Docker really necessary for all milestones?
- What kind of tests will be created and how can we evaluate them?
- For milestone 3, how can we effectively evaluate the human feedback loop that is mentioned?
pinging @AkhilSharma90
Hi team,
Apologies for the late reply; we have our friendly investor pitch day at Outlier Ventures on the 11th, and I got busy fixing the deck and the data room :)
I'll reply to all the comments and make changes as well.
Yes, I will do this and push the changes.
Got it. We need at least 5-6 code examples per categorized vulnerability to be successful. We can additionally build this dataset with you if we're too early; we're passionate about the problem statement, and being an early mover will help us a lot. If the timelines can be relaxed to let us help you build the datasets, it will be an awesome collaboration.
Your team member NOC2 mentioned earlier that the dataset on vulnerabilities is too small; the term "dataset" he's using actually refers to the raw data of vulnerabilities and code samples that might exist for us to create a dataset from.
OK, I will make all these changes to the application.
Hi Keegan,
Thanks for the changes @AkhilSharma90, I appreciate the added technical details. However, it looks like the mandatory deliverables (0a. to 0e.) were removed from all milestones. These should stay; apologies if I wasn't clear on that. The 0d. Docker section can be removed if it's not applicable, and the 0e. blog article is only required for the last milestone. IMO it would be great to have that after the grant is completed.
0b. and 0c. should remain, for "Documentation" and "Testing & testing guide" respectively. That way we have a tangible way to evaluate the work that has been accomplished. I'm not sure what these tests would consist of for this kind of project, though. For example, I'm still not exactly sure how we would evaluate milestones 3 & 4. Will there be some kind of guide or tutorial that walks us through how to interact with the A.I. models?
Hi Keegan, please check now; the requested changes are done. Thanks,
@keeganquigley, is everything OK? Please let me know if any more details are required.
> We then used Chat-GPT, LLaMA and Bard, presented them with smart contract code samples and asked them to find issues. The output from these models was used as the "expected output" and the input code sample was used as the "input" to train our models.
This sounds like you are using the untrained model output as the expected output for your training, effectively training on incorrect data. Can you elaborate?
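To make sure we are talking about the same thing, here is a minimal sketch of the pipeline as I understand it from your description (hypothetical function, model and file names, using the `openai` client purely as an example): whatever the labelling model gets wrong becomes part of the training labels.

```python
# Hypothetical sketch of the described pipeline: an existing LLM's answer is
# stored as the "expected output" label for each input code sample.
import json
from openai import OpenAI  # assumes the `openai` Python package is installed

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def label_with_llm(code_sample: str) -> str:
    """Ask an LLM to audit a contract; its (possibly wrong) answer becomes the label."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[
            {"role": "system", "content": "You are a smart contract auditor."},
            {"role": "user", "content": f"Find security issues in this contract:\n{code_sample}"},
        ],
    )
    return response.choices[0].message.content

def build_dataset(code_samples: list[str], path: str = "train.jsonl") -> None:
    # Any false positives or missed vulnerabilities in the LLM's answers are
    # baked into the ground truth that the new model is trained on.
    with open(path, "w") as f:
        for sample in code_samples:
            record = {"input": sample, "expected_output": label_with_llm(sample)}
            f.write(json.dumps(record) + "\n")
```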
Do you have a list of vulnerabilities your Solidity model recognises? I just had a look at the article you linked above, and it lists attacks like "Fake tokens"/"Fake contracts" and "Governance attack". Those are largely unrelated to the source code of the contracts, so I'm curious how they would be discoverable.
| 0e. | Article | We will publish an **article**/workshop that explains [...] (what was done/achieved as part of the grant). (Content, language and medium should reflect your target audience described above.) |
| 1. | Dataset Creation | In the first month, we're going to focus on finding smart contracts and audit reports and converting them into the required format to train our A.I. models, so this stage is basically "dataset" creation. |
| **0a.** | License | Apache 2.0 |
| **0b.** | Documentation | We will not only provide documentation for the datasets but also the datasets themselves at each stage, mostly shared as a whimsical document or as GitHub documentation, and we will upload the datasets themselves to GitHub. |
"whimsical"?
Hi Semuelle,

Please note that we have a working product for Solidity (and have gone through multiple stages of technical due diligence at the two accelerators we are a part of). However, over the past few weeks of interacting with you and the Polkadot team, we feel this is taking a very different direction. The questions being asked of us are as if we already have a working product for Polkadot (and we have been answering them too). The thing is, we will have to experiment quite a bit and are not sure if the same methodology will work here.

We were considering this a collaboration to build something awesome with an experimental approach, but you're going into a lot of unnecessary details and nitty-gritty. What's happening here is the same thing that's happened in all my previous interactions with the Polkadot community: long, stretched-out conversations that are always inconclusive. Also, there's always analysis paralysis and decision fatigue. We (Armur) are a fast-growing and fast-moving startup with multiple priorities. We thought collaborating with Polkadot could lead to great results, and we also believe that all the details you require to make a decision are already with you by now. It seems that what we want to build is clearly not a priority for you at the moment, and if you feel this doesn't help the Polkadot ecosystem much, we're happy to close the application.

Now, regarding the questions you've asked:
We can do this endlessly, and it is wasting both our time and yours. If you feel this isn't a priority, please just let us know instead of making incorrect assumptions and asking irrelevant questions. Thanks,
Thanks @AkhilSharma90 for your answers. I will try to address some of your concerns below:
> The thing is, we will have to experiment quite a bit and are not sure if the same methodology will work here.
In that case, I recommend a smaller, initial grant to make some tests and write some reports, then to apply for a follow-up grant if the results are promising. You could break it down into a POC as Keegan suggested, or turn it into a research grant and deliver a report. I do think a dataset of ink! vulnerabilities and the results of training a model on them would be interesting.
> If you feel this isn't a priority, please just let us know instead of making incorrect assumptions and asking irrelevant questions.
I asked whether the output of a model is the input for your training, which I took from how you described your pipeline. That would be unusual, as your ground truth would be highly flawed. These are very simple and relevant questions to ask of a machine learning project. Perhaps you can illustrate your training and test pipeline in more detail in the application to avoid further questions.
> You're again assuming something that's wrong.
I said "it sounds like", and asked you to elaborate. Very few assumptions here.
> Chat-GPT, Bard etc. can already audit smart contracts; training on their outputs ensures you have your base covered.
Going by table 4 of the article you linked [1], Chat-GPT has a precision of 78.7% and a recall of 69.8% for a binary classification task. The false positives for the non-binary task (which you are aiming for) aren't reported, but we can assume precision and recall are worse. Training on their output means your ground truth is flawed. I think these metrics are highly relevant, so that we don't give users a false sense of security.
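For reference, these are the standard definitions of the two metrics (my framing; the paper only reports the two figures):

$$
\text{precision} = \frac{TP}{TP + FP} = 0.787, \qquad \text{recall} = \frac{TP}{TP + FN} = 0.698
$$

In other words, roughly one in five findings Chat-GPT flags is a false positive, and roughly three in ten real vulnerabilities are never flagged at all; labels generated this way carry that error rate straight into the training set.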
> We were considering this a collaboration to build something awesome with an experimental approach, but you're going into a lot of unnecessary details and nitty-gritty.

> What's happening here is the same thing that's happened in all my previous interactions with the Polkadot community: long, stretched-out conversations that are always inconclusive. Also, there's always analysis paralysis and decision fatigue.
I'm sorry you are having a hard time with the community. I cannot speak for others, but it is our responsibility to make sure the funds go to projects that further the ecosystem. This is a technical program, so making sure that the concept is technically sound is a big part of that process.
Footnotes
[1] David, I., Zhou, L., Qin, K., Song, D., Cavallaro, L., & Gervais, A. (2023). Do you still need a manual smart contract audit? arXiv preprint arXiv:2306.12338.
@keeganquigley @semuelle
@AkhilSharma90 Thanks for the proposal, but I have some of the same reservations as the other committee members who shared their feedback. The ink! dataset seems too sparse right now to train effective models, and I'd also want more technical details on your existing proposal. The budget and timeline seem high, especially given the unknowns. Open-sourcing your Solidity outputs could also help ease IP concerns. Since none of the committee members who gave feedback so far seem convinced either, I'm going to close it for now. But you're always welcome to apply for a new grant for another idea, or with a revised proposal that incorporates all the critiques that were raised.