New practice exercise - Perceptron #678
I think this should maybe be a test failure. It would simplify the messaging for the student, both in the instructions and from the tests. Do you think that would be okay?
Also, negative numbers are scalars (at least to me?), so I would probably drop the first part of the first sentence.
I didn't want to strictly penalize a flipped normal since it is just a convention (logical though it may be), and it would still tell the student they have successfully found a decision boundary. In any case, I think it would be nice to leave this in, but if you think it's better to enforce the convention (which of course is good practice), we can turn this into a failed test with its own error message.
I've rephrased the scalar multiplication part to better convey what I was trying to say (that a hyperplane looks the same multiplied by any scalar, but negative scalars flip what we define to be the normal). Hopefully that's clearer :)
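To illustrate (a minimal sketch — the `[b, w1, w2]` layout for the boundary and the augmented `[1, x1, x2]` points are just conventions I'm assuming here):

```julia
# A hyperplane is unchanged by any nonzero scalar multiple of its coefficients,
# but a negative scalar flips which way the normal points.
boundary = [-4.0, 0.0, 2.0]    # assumed layout: [b, w1, w2], i.e. 2*x2 - 4 = 0
point    = [1.0, 1.0, 3.0]     # augmented point: [1, x1, x2]

s_orig    = sign(boundary' * point)           # which side the point falls on
s_scaled  = sign((5 .* boundary)' * point)    # positive scalar: same side
s_flipped = sign((-1 .* boundary)' * point)   # negative scalar: normal flips

@assert s_orig == s_scaled     # same hyperplane, same classification
@assert s_orig == -s_flipped   # same line in the plane, but the sides swap
```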
I think if we ask the student to define a classify(boundary, point) function then they'll have to get consistent about the boundary anyway?
See #678 (comment)
Edit: moving this down here to split up topics
I was thinking about this, and it appears it is redundant to test unseen points which are guaranteed to be on the correct side of any possible decision boundary for a population. That's because there are two hyperplanes which define the limits of a cone of possibility, and each of these limits can be defined by two points, one from each class. These two points work like support vectors, and are robust to any additional (guaranteed valid) unseen points. This means only the support vectors need to be classified correctly to verify that the hyperplane is a valid decision boundary, and this is already done under the current testing.
For this reason I think it would be best to leave out testing of unseen points, since, for students encountering this for the first time, it could give the unintended impression that Perceptron will always correctly classify unseen points.
Any thoughts?
I believe I understand your intentions, but I want to check if I'm getting everything:
I can possibly understand this two ways, so I'll take a stab at both:
This actually still leaves an ambiguity. If a student presents an incorrect decision boundary (DB), the unseen point is not guaranteed to be either correctly or incorrectly classified; it all depends on the particular DB. Furthermore, if the point happens to be classified correctly, it can give the impression that the DB is correct, rather than relaying the actual information (that the DB is merely not shown to be incorrect), thereby misleading the student.
Just a note: the reason I said that any test that can be done with unseen points can be done as well or better with seen points is that correct classification of at most four special points (support vectors) out of the seen points is both necessary and sufficient to tell whether the DB is 100% correct. Any other test is technically deficient on its own, and deficiency introduces ambiguity. On the other hand, any other test in conjunction with classifying the support vectors is superfluous. However, finding the support vectors is a non-trivial task, so every point is exhaustively classified in testing (to make sure they are found and tested).
I believe this is what the pseudorandom test set does and accuracy is required to be 100%. Is there something further that I'm missing?
When a test fails, I believe this adds unnecessary confusion as to which function is causing the failure.
At this point, even if this is the case, the added possibility of confusion created through various ambiguities, and/or the difficulty of trying to mitigate them, makes including these features seem not worth it. I don't mind including ambiguity in exercises, since it provides 'real world' experience, but it depends on the audience. From what I gather from your concern on this point, you are looking out for less experienced students, so I presume less ambiguity would be better?
Thanks! I was trying to think of the simple way you had mentioned, but I had lost the trees for the forest :)
Sorry, I just wanted to clarify, I am on board with a classification function if it's tested separately before being used in conjunction with the perceptron function.
It's really the unseen points I'm most hesitant to accept. I think they would be very appropriate in the case of SVM, but the nature of Perceptron just seems to make them more of a bane than a benefit.
I think your belief that unseen points will cause ambiguity (test failures for valid decision boundaries) is making you think that they're a big problem.
But, as I hope the triangle centroid thing shows, we can generate unseen points with zero chance of them causing false test failures.
If you don't believe the triangle thing, then maybe you can believe that, for linearly separable datasets, any point contained by the convex hull of the points of one label will be given that label by any linear boundary-based classifier that has zero error on the training set. For a linear boundary and a 2D space that one should be provable geometrically with a ruler and some drawings.
Or you can easily show that any point on the line segment between the support vectors for a label must be classified correctly by any valid linear decision boundary.
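Here's a quick sketch of that construction (the `classify` helper and the `[b, w1, w2]` boundary layout are assumptions for illustration, not anything in the PR):

```julia
# Any convex combination (e.g. the centroid) of same-label training points must
# receive that label from any linear boundary with zero error on the training set.
classify(boundary, point) = boundary' * [1.0; point] > 0 ? 1 : -1  # hypothetical helper

label_one = [[1.0, 3.0], [2.0, 3.5], [0.5, 4.0]]  # three training points labeled 1
unseen    = sum(label_one) / 3                    # centroid: inside their convex hull

boundary = [-4.0, 0.0, 2.0]                       # one valid boundary: 2*x2 - 4 = 0
@assert all(classify(boundary, p) == 1 for p in label_one)
@assert classify(boundary, unseen) == 1           # guaranteed, not a coincidence
```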
I think I have failed to unmuddle my concern :) I'm okay with testing to demonstrate how classification of unseen points works, but testing as proposed is too easily confused with testing unseen points to verify the correctness of a decision boundary. My thoughts on why testing unseen points to verify correctness is undesirable follow:
I'm with you on the centroid idea (which I quite liked) and familiar with the other arguments you've given, so I'm not concerned about test failures for valid decision boundaries. The minor issue I have in the case of valid decision boundaries is that testing with unseen points to validate a decision boundary is simply redundant. Why it is redundant can be gleaned from your arguments on why and how we can choose correct unseen points: namely they are chosen entirely in reference to seen points.
One issue that concerns me involves invalid decision boundaries. With an invalid decision boundary, there is no guarantee how an unseen point will be classified (just as there is no guarantee how any point will be classified). There are two cases:
The other issue that concerns me is on a more psychological note: People often take partial information, over-generalize and create false impressions (e.g. stereotypes/prejudices/etc). That's how some of the more naive or inattentive students, seeing testing on unseen points in this way, could walk away with the mistaken impression that Perceptron can correctly classify any unseen point (not just carefully selected points). This is the second aspect I dislike.
So, given that the possible outcomes of testing unseen points are either redundant good info or unnecessary, potentially misleading info, and given the possible misconception about Perceptron that can be conveyed simply by including the testing, I don't feel that testing unseen points in the proposed manner brings enough to the table to be included.
To boil it all down: I feel that 'testing' unseen points to demonstrate how classification works has value, but I think this is too easily confused with testing to verify correctness of the decision boundary. I currently can't think of tests which can eradicate my concerns above. However, there are related ideas we could demonstrate.
For example, we could show that introducing a new point to the data will (in all likelihood) result in a different decision boundary. This doesn't involve classification of unseen points, though. As an aside: with SVM, we could demonstrate unseen-point classification switching with a before-and-after approach, but it just doesn't work with Perceptron because of the non-uniqueness of the decision boundary (i.e. we can't be guaranteed a switch in classification).
Do you have any other ideas on ways to demonstrate classification of unseen points?
The cost vs benefit analysis that is going on in my head is the following:
The exercise is first and foremost meant to have students learn about the Perceptron algorithm. Including a demonstration of how classification works would be a bonus, but I just am not sure this type of testing environment properly supports demonstration of this kind.
This equation and the next two paragraphs are not clear enough, imo.
The update rule is where the magic happens, so it would be nice to explain this better and try to give the student some insight into how the rule works.
The $\pm\mp$ stuff will also be impenetrable to most readers, I think.
I got rid of the plus/minus stuff, since I agree it's confusing/distracting and it really isn't terribly important anyway.
Were you thinking along the lines of more explanation of how the update makes the line move? Or were you thinking more clarity on how to implement it? I had intentionally left the implementation details a little vague (stopping short of pseudocode), hoping to leave room for the student to be creative or do some research for their implementation. However, I'm aware that some students are at a learning level where this is frustrating. Let me know your thoughts :)
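For reference, the kind of update I have in mind looks roughly like this (a sketch only, assuming the augmented `[b, w1, w2]` boundary, `[1, x1, x2]` points, and ±1 labels I've been using):

```julia
# For a misclassified point, adding label * point to the coefficients tilts and
# shifts the hyperplane toward that point's correct side.
function update(boundary, point, label)
    # point is augmented as [1, x1, x2]; label is +1 or -1
    if label * (boundary' * point) <= 0  # wrong side, or exactly on the line
        boundary = boundary + label .* point
    end
    return boundary
end
```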
I'll see if I have time and energy to attempt the exercise myself during the week and maybe that will help me clarify my ideas :)
We also don't support rendering equations as far as I can recall.
Fixed.
Is it? I don't see a perceptron exercise in problem-specifications.
I'm not sure if I fixed this or not. I've deleted the auto-generated comment. I had to manually create the UUID since, when I ran configlet, it kept saying everything was done even when there were no UUIDs in this file (sorry, I can't remember the exact message). Let me know if I need to do something else.
Sorry, I'm not very familiar with what configlet needs when we're writing a practice exercise from scratch (one where there is not an upstream definition in problem-specifications). Someone in @exercism/reviewers might know.
Looking at some problem-specifications, I'm wondering if pseudorandom test generation is possible on the platform. Checking `canonical-data.json` for Alphametics shows each exercise with data such as a unique UUID, plus the input and expected result. If this file is necessary and needs the hardcoded input and expected output, it'll be a bit more work to produce, although possible.
canonical-data isn't required. You can see parametric tests in the rational-numbers exercise and elsewhere.
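Roughly this pattern (illustrative only — `perceptron`, `classify`, and the data here are placeholders, not the actual test file):

```julia
using Test

testset = [
    (points = [[1.0, 3.0], [2.0, -1.0]], labels = [1, -1]),
    (points = [[0.0, 1.0], [0.0, -1.0]], labels = [1, -1]),
]

@testset "valid decision boundary" begin
    for case in testset
        boundary = perceptron(case.points, case.labels)
        @test all(classify(boundary, p) == l
                  for (p, l) in zip(case.points, case.labels))
    end
end
```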
Great! I'd forgotten about the testset at the end of Rational Numbers :)
I think we should include some tests with manually specified input data of just a few points to make this more approachable and also as good practice (e.g. first few tests could be spaces with just 2-4 points placed manually).
Sorry, I'm not sure if I've understood, but there are four tests with (six) manually specified points to illustrate a couple of different possible orientations of a hyperplane (testset "Low population"). After that the 40 pseudorandomly generated tests begin (testset "Increasing Populations"). Was this what you meant?
I meant we could take some of the manually specified examples out of the runtestset() function and test them without the support functions, just so that it's less mystifying to a student reading the tests.
And maybe we should have the student write their own function for finding the computed label of a point?
e.g.
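(something along these lines — the signature and the `[b, w1, w2]` layout are just suggestions):

```julia
# hypothetical signature; boundary = [b, w1, w2], point = [x1, x2], labels are ±1
function classify(boundary, point)
    return boundary' * [1.0; point] > 0 ? 1 : -1
end

classify([-4.0, 0.0, 2.0], [1.0, 3.0])  # == 1
```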
Sure! I can throw a couple in there like your first examples, but I've got a question about the `classify`: due to the wide range of possible decision boundaries returned by Perceptron, beyond carefully selected examples, I'm not sure if testing classification of unseen points is viable. Also, since the algo is effectively just the three parts of classify + update + repeat, couldn't requiring a separate classification function introduce an unnecessary constraint on the task?
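For context, the whole algorithm as I picture it is roughly this (a sketch under the same conventions as above, not the reference solution):

```julia
function perceptron(points, labels)
    boundary  = zeros(3)                    # start from [0, 0, 0]
    augmented = [[1.0; p] for p in points]  # prepend 1 for the bias term
    converged = false
    while !converged                        # repeat ...
        converged = true
        for (x, y) in zip(augmented, labels)
            if y * (boundary' * x) <= 0     # ... classify ...
                boundary += y .* x          # ... and update on a mistake
                converged = false
            end
        end
    end
    return boundary
end
```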
Cool!
I don't see why it would constrain the student or the task for us to ask them to provide a classify function. What do you mean by that?
I agree that testing with unseen points is a bit tricky, but it's not super hard given that we control the input data, and it's not too bad for us to either only include hand-written tests or to do some geometry to ensure our unseen points will always be classified correctly by a valid linear boundary.
Sorry, over-generalization from me there :) I guess I'm considering the basic (pedagogical) Perceptron, i.e. one provided with small, dense, separable populations, since, under separability, the learning rate doesn't affect either the final loss or the upper bound on the number of errors the algorithm can make. That said, I was wrong to say the initial hyperplane affects these, since it doesn't either. The initial hyperplane just seems non-trivial to me because it can be returned.
I'll try to make the necessary changes to the PR today and/or tomorrow :)
Is there anything you would add (e.g. subtyping) to my suggestion for a slug, or do you think I could just copy and paste?
Edit: In my initial suggestion for the slug, conversion to Float64 is enforced, but the exercise (as presented) could also be handled entirely with integers. Should I drop all references to Float64? Something like:
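(a rough sketch of the stub with the type annotations dropped — illustrative, not the exact slug):

```julia
# Illustrative stub only: return a decision boundary [b, w1, w2] separating the
# given points by their labels.
function perceptron(points, labels)

end
```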
It might make it appear cleaner and less intimidating for students unfamiliar with the type system.
I don't think the struct is really adding much, so I'd remove it. You can pick what you like, tho.
If I try the exercise and hate the struct then I might veto it, but not until then.
Agree with this
Could you be more specific? There are already four tests with six manually specified points which check for different possible orientations of a decision boundary.
Beyond this, we've had an extensive conversation about using "unseen points". (TLDR: I believe testing unseen points to be potentially more detrimental to understanding than beneficial)
The other ideas for tests were of the type to check whether the student is returning the correct object (a vector of three real numbers), etc. Is there something else you were thinking of?