New practice exercise - Perceptron #678
Conversation
It looks like a lot of checks failed. I apologize; I'm not sure what happened.
Would you mind explaining what you intend to do in your commit? For the test cases in particular, it would help to provide a mathematical explanation. If you do, I can try to help with your pull request after next week.
Edit: I just had a look at the

TLDR: I have experience creating coding exercises, but, while I've done my best to read the Exercism docs, I still have quite limited experience with GitHub and the Exercism contribution procedure, so that is likely where I'll need the most help.

Excuse me if I misunderstand your question, since I'm new to this process, but I'll give a bit more mathematical description of the functions I use to create the test cases and to test the returned values.

There are two types of test cases: fixed and pseudorandom. The fixed cases are just an array of six points and an array of labels corresponding to each point's class. The pseudorandom tests use the function

The task, put simply, is to return a line (in hyperplane format) which completely separates the points, with one class on one side and the other class on the other side. The other function in

That being said, I believe there should be no problem with the mechanics/mathematics of the exercise, since I've tested it quite thoroughly in a couple of places. I figure there might be an issue with configuration, since this is my first contribution here and I've likely messed something up. For example, I'm not sure if Exercism supports this type of pseudorandom test generation, since I've only seen the fixed test type, which usually gets a separate UUID for each test in

Please feel free to ask for any further clarifications.
After looking over the errors more, it looks like I've seen extra files added to the two normal files (

In other words, is it possible to get
I pushed a merge commit so we can see how CI likes it when running on a newer version of the repo. If that is inconvenient for you, please let me know and I won't do it again :)
Thanks Colin! It looks like there's still an issue with testing. However, I think it's a different (but possibly related) error from what I saw before. The

I didn't want to put the testing function
You're welcome :) The error is probably because we override

I'm not sure there is a good way to hide the test code from the student. I'd just put any spoilery functions at the bottom of the test file with a note saying "Reading past here will give you the answer, so watch out!"
Added helper functions from `testtools.jl` to the bottom of the file so it doesn't have to be included. Since the `dotest` function does not show a working solution (only how to check for one), I think a spoiler warning is unnecessary and could just end up attracting more attention than having no warning at all.
Tests still failed with helper functions at the bottom of the file, so I'm moving them to the top to see if that changes things.
It seems to be working now* :) I've put the helper functions at the head of

I've left the

It's too bad that we can't include other files for testing. It would make testing coding problems with pseudorandom test generation much easier, since we could use a working reference solution to check against the students'. However, this exercise doesn't have an explicit working solution, so it's not a huge concern.

*just a note: the
The main motivation is that you want testtools.jl not to appear in the web UI, right? I'm not sure that's practical while also having it be downloaded for people who use the exercism CLI.
We can put the functions at the end of the file if we invoke the testset within a function and only call that function at the end of runtests.jl |
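A minimal sketch of that layout (the testset contents and names below are illustrative, not the actual code from this PR):

# runtests.jl -- illustrative layout only
using Test
include("perceptron.jl")

function runtestset()
    @testset "Perceptron separates two points" begin
        boundary = perceptron([[0, 0], [3, 3]], [1, -1])
        @test length(boundary) == 3
    end
end

# Spoiler-prone helper functions can sit down here, because nothing
# executes until runtestset() is called on the last line.

runtestset()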
Wrapped the test set in a function to be called at the end of the file so the helper functions can be put at the bottom of the file.
Deleting since test helper functions have been moved to `runtests.jl`
Oh, good point! I hadn't considered compatibility with the CLI. Cheers! I've put the test set in a function and called it at the end of the file, as you suggested, and it looks like all the tests have passed. I've also deleted `testtools.jl`.

@cmcaine One other thing I would like to ask about is attribution. I had put my name as author and yours as contributor just so as not to leave those fields blank, but is it okay to leave them blank? I don't really care to be listed as an author, and I don't know whether you want to be credited as a contributor or not.
Cool, thanks for fixing these things up :) I'll read through it properly in the next couple of days! I don't know what happens if no authors are listed. If you strongly prefer not to be listed as the author on the website then you could try to credit
Thanks for the help! :) I'm quite flexible on the author situation, so if it's any hassle at all, I certainly don't mind going with the norm. Feel free to modify/leave that (or anything else) as you see fit!
@@ -0,0 +1,25 @@
# This is an auto-generated file.
Is it? I don't see a perceptron exercise in problem-specifications.
I'm not sure if I fixed this or not. I've deleted the auto-generated comment. I had to manually create the UUID because, when I ran configlet, it kept saying everything was done even when there were no UUIDs in this file (sorry, I can't remember the exact message). Let me know if I need to do something else.
Sorry, I'm not very familiar with what configlet needs when we're writing a practice exercise from scratch (one where there is not an upstream definition in problem-specifications). Someone in @exercism/reviewers might know.
Looking at some problem-specifications, I'm wondering if pseudorandom test generation is possible on the platform. Checking canonical-data.json for Alphametics shows each test case with data such as a unique UUID, plus the input and expected result. If this file is necessary and needs the hardcoded input and expected output, it would be possible, but a bit more work to produce.
canonical-data isn't required. You can see parametric tests in the rational-numbers exercise and elsewhere.
Great! I'd forgotten about the testset at the end of Rational Numbers :)
using Test, Random
include("perceptron.jl")

function runtestset()
I think we should include some tests with manually specified input data of just a few points to make this more approachable and also as good practice (e.g. first few tests could be spaces with just 2-4 points placed manually).
Sorry, I'm not sure if I've understood, but there are four tests with (six) manually specified points to illustrate a couple of different possible orientations of a hyperplane (testset "Low population"). After that the 40 pseudorandomly generated tests begin (testset "Increasing Populations"). Was this what you meant?
I meant we could take some of the manually specified examples out of the runtestset() function and test them without the support functions, just so that it's less mystifying to a student reading the tests.
And maybe we should have the student write their own function for finding the computed label of a point?
e.g.
# Student must implement both `perceptron(points, labels) -> boundary (a vector of 3 weights)` and `classify(boundary, point)`

@testset "Boundary is a vector of 3 weights" begin
    boundary = perceptron([[0, 0], [3, 3]], [1, -1])
    @test eltype(boundary) <: Real
    @test length(boundary) == 3
end

@testset "Originally provided points should be classified correctly" begin
    boundary = perceptron([[0, 0], [3, 3]], [1, -1])
    @test classify(boundary, [0, 0]) == 1
    @test classify(boundary, [3, 3]) == -1
end

@testset "Given 3 labeled points, an unseen point is classified correctly" begin
    # Adding more points constrains the location of the boundary so that we can test
    # the classification of unseen points.
    boundary = perceptron([[0, 0], [1, 0], [0, 1]], [-1, 1, 1])
    @test classify(boundary, [0, 0]) == -1
    @test classify(boundary, [2, 5]) == 1
end
Sure! I can throw a couple in there like your first examples, but I've got a question about the classify function:
Due to the wide range of possible decision boundaries returned by Perceptron, beyond carefully selected examples, I'm not sure if testing classification of unseen points is viable. Also, since the algo is effectively just the three parts of classify + update + repeat, couldn't requiring a separate classification function introduce an unnecessary constraint on the task?
Cool!
I don't see why it would constrain the student or the task for us to ask them to provide a classify function. What do you mean by that?
I agree that testing with unseen points is a bit tricky, but it's not super hard given that we control the input data and it's not too bad for us to either only include hand-written tests or to do some geometry to ensure our unseen points will always be classified correctly by a valid linear boundary.
Is the learning rate trivial?
Sorry, over-generalization from me there :) I guess I'm considering the basic (pedagogical) Perceptron, i.e. one provided with small, dense, separable populations, since, under separability, the learning rate doesn't affect either the final loss or the upper bound on the number of errors the algorithm can make. That said, I was wrong to say the initial hyperplane affects these, since it doesn't either. The initial hyperplane just seems non-trivial to me because it can be returned.
I'll try to make the necessary changes to the PR today and/or tomorrow :)
Is there anything you would add (e.g. subtyping) to my suggestion for a slug, or do you think I could just copy and paste?
Edit: In my initial suggestion for the slug, conversion to Float64 is enforced, but the exercise (as presented) could also be handled entirely with integers. Should I drop all references to Float64? Something like:
mutable struct Perceptron
    # instantiates Perceptron with a decision boundary
    # this struct can remain unmodified
    dbound
    Perceptron() = new([0, 0, 0])
end

function fit!(model::Perceptron, points, labels)
    # updates the field dbound of model (model.dbound) and returns it as a valid decision boundary
    # your code here
end

function predict(model::Perceptron, points)
    # returns a vector of the predicted labels of points against the model's decision boundary
    # your code here
end
It might make it appear cleaner and less intimidating for students unfamiliar with the type system.
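For what it's worth, a hypothetical usage of that stub might look like the following (assuming the student has filled in `fit!` and `predict`; the output shown is only illustrative):

model = Perceptron()                       # starts with dbound == [0, 0, 0]
fit!(model, [[0, 0], [3, 3]], [1, -1])     # trains and returns the decision boundary
predict(model, [[0, 0], [3, 3]])           # returns the predicted labels, e.g. [1, -1]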
I don't think the struct is really adding much, so I'd remove it. You can pick what you like, tho.
If I try the exercise and hate the struct then I might veto it, but not until then.
I think we should include some tests with manually specified input data of just a few points to make this more approachable and also as good practice (e.g. first few tests could be spaces with just 2-4 points placed manually).
Agree with this
Agree with this
Could you be more specific? There are already four tests with six manually specified points which check for different possible orientations of a decision boundary.
Beyond this, we've had an extensive conversation about using "unseen points". (TLDR: I believe testing unseen points to be potentially more detrimental to understanding than beneficial)
The other ideas for tests were checks that the student is returning the correct object (a vector of three real numbers), etc. Is there something else you were thinking of?
We will be dealing with a two-dimensional space, so our divider will be a line. The standard equation for a line is usually written as $y = ax + b$, where $a, b \in \mathbb{R}$. However, to help generalize the idea to higher dimensions, it's convenient to reformulate this equation as $w_0 + w_1x + w_2y = 0$. This is the form of the [hyperplane](https://en.wikipedia.org/wiki/Hyperplane) we will be using, so your output should be $[w_0, w_1, w_2]$. In machine learning, $w_0, w_1, w_2$ are usually referred to as weights.

While hyperplanes are equivalent under scalar multiplication, there is a difference between $[w_0, w_1, w_2]$ and $[-w_0, -w_1, -w_2]$: the normal to the hyperplane points in opposite directions. By convention, the perceptron normal points towards the class defined as positive, so this property will be checked but will not result in a test failure.
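As a rough sketch of the convention described above (the helper name and numbers are made up for illustration and are not part of the exercise stub):

# score > 0 means the point lies on the side the normal points towards
side(weights, point) = sign(weights[1] + weights[2] * point[1] + weights[3] * point[2])

w = [1.0, -2.0, 3.0]
side(w, [2.0, 2.0])    # +1: the normal points towards (2, 2)
side(-w, [2.0, 2.0])   # -1: same line, but the normal (and the "positive" side) is flipped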
I think this should maybe be a test failure. It would simplify the messaging for the student in the instruction and from the tests. Do you think that would be okay?
Also, negative numbers are scalars (at least to me?), so I would probably drop the first part of the first sentence.
I think this should maybe be a test failure. It would simplify the messaging for the student in the instruction and from the tests. Do you think that would be okay?
I didn't want to strictly penalize a flipped normal, since it is just a convention (logical though it may be), and it would still tell the student they have successfully found a decision boundary. In any case, I think it would be nice to leave this in, but if you think it's better to enforce the convention (which of course is good practice), we can turn this into a failed test with its own error message.
Also, negative numbers are scalars (at least to me?), so I would probably drop the first part of the first sentence.
I've rephrased the scalar multiplication part to make what I was trying to say clearer (that a hyperplane looks the same multiplied by any scalar, but negative scalars flip what we define to be the normal). Hopefully that's clearer :)
I think if we ask the student to define a classify(boundary, point) function then they'll have to get consistent about the boundary anyway?
See #678 (comment)
Edit: moving this down here to split up topics
I agree that testing with unseen points is a bit tricky, but it's not super hard given that we control the input data and it's not too bad for us to either only include hand-written tests or to do some geometry to ensure our unseen points will always be classified correctly by a valid linear boundary.
I was thinking about this, and it appears it is redundant to test unseen points which are guaranteed to be on the correct side of any possible decision boundary for a population. That's because there are two hyperplanes which define the limits of a cone of possibility, and each of these limits can be defined by two points, one from each class. These two points work like support vectors, and are robust to any additional (guaranteed valid) unseen points. This means only the support vectors need to be classified correctly to verify that the hyperplane is a valid decision boundary, and this is already done under the current testing.
For this reason I think it would be best to leave out testing of unseen points, since, for students encountering this for the first time, it could give the unintended impression that Perceptron will always correctly classify unseen points.
Any thoughts?
I believe I understand your intentions, but I want to check if I'm getting everything:
- Confused students won't be able to encode the seen points in the perceptron and pass the tests.
I can possibly understand this two ways, so I'll take a stab at both:
- Do you mean students won't have seen-points to use locally to debug? If so, the first set of fixed tests have the points and labels shown explicitly.
- Or, do you mean they might hard code the answers into their function in order to pass the tests (i.e. 'cheat')? If so, the pseudorandom tests don't explicitly show the seen points, and someone would have to hard code over 1200 points across 40+ tests to pass, which, while doable, seems unlikely (also couldn't they just look at other students' solutions after submitting once?).
We can have unseen points without producing ambiguous tests by picking points that are geometrically guaranteed...
This actually still leaves an ambiguity. If a student presents an incorrect decision boundary (DB), the unseen point is not guaranteed to be either correctly or incorrectly classified. It all depends on the particular DB. Furthermore, if the point happens to be classified correctly, it can give the impression that the DB is correct instead of properly relaying the actual information of being just not incorrect, thereby misleading the student.
Just a note: the reason I said that any test that can be done with unseen points can be done as well or better with seen points is that correct classification of a maximum of four special points (support vectors) out of the seen points is both necessary and sufficient to tell whether the DB is 100% correct or not. Any other test is technically deficient on its own, and deficiency introduces ambiguity. On the other hand, any other test in conjunction with classifying the support vectors is superfluous. However, finding the support vectors is a non-trivial task, so every point is exhaustively classified in testing (to make sure they are found and tested).
Another option would be to train on and classify larger datasets and assert that the accuracy is above some threshold that it is either impossible or implausible for a correct implementation to miss.
I believe this is what the pseudorandom test set does and accuracy is required to be 100%. Is there something further that I'm missing?
I think it's fine to just call both functions for every test.
I believe this adds unnecessary confusion when there is a failed test as to which function is causing the failure.
- the tests will make more sense to students who read them.
At this point, even if this is the case, the added possibility of confusion created through various ambiguities, and/or the difficulties in trying to mitigate them, make including these features seem to be not worth it. I don't mind including ambiguity in exercises, since it provides 'real world' experience, but it depends on the audience. From what I gather from your concern in this point, you are looking out for less experienced students, so I would presume less ambiguity would be better?
E.g. for linearly separable datasets, I think the centroid of any triangle drawn from points of the same set will be classified correctly by any correct implementation.
Thanks! I was trying to think of the simple way you had mentioned, but I had lost the trees for the forest :)
I don't think that it's necessary to test the training and classification separately, though we can choose to do so. I think it's fine to just call both functions for every test, as shown above.
Sorry, I just wanted to clarify, I am on board with a classification function if it's tested separately before being used in conjunction with the perceptron function.
It's really the unseen points I'm most hesitant to accept. I think they would be very appropriate in the case of SVM, but the nature of Perceptron just seems to make them more of a bane than a benefit.
I think your belief that unseen points will cause ambiguity (test failures for valid decision boundaries) is making you think that they're a big problem.
But, as I hope the triangle centroid thing shows, we can generate unseen points with zero chance of them causing false test failures.
If you don't believe the triangle thing, then maybe you can believe that, for linearly separable datasets, any point contained by the convex hull of the points of one label will be given that label by any linear boundary-based classifier that has zero error on the training set. For a linear boundary and a 2D space that one should be provable geometrically with a ruler and some drawings.
Or you can easily show that any point on the line segment between the support vectors for a label must be classified correctly by any valid linear decision boundary.
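A quick numerical illustration of that convexity argument (the boundary and points below are made up): because the score $w_0 + w_1x + w_2y$ is linear, its value at a convex combination of same-label points (such as a triangle centroid) is the same convex combination of their same-signed scores.

score(w, p) = w[1] + w[2] * p[1] + w[3] * p[2]

w = [-3.0, 1.0, 1.0]                          # hypothetical valid boundary: x + y = 3
pts = [[2.0, 2.0], [4.0, 1.0], [3.0, 3.0]]    # three points this boundary labels +1
all(score(w, p) > 0 for p in pts)             # true

centroid = sum(pts) / length(pts)             # [3.0, 2.0], inside the triangle
score(w, centroid) > 0                        # true: it must get the same label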
I think I have failed to unmuddle my concern :) I'm okay with testing to demonstrate how classification of unseen points works, but testing as proposed is too easily confused with testing unseen points to verify the correctness of a decision boundary. My thoughts on why testing unseen points to verify correctness is undesirable follow:
I'm with you on the centroid idea (which I quite liked) and familiar with the other arguments you've given, so I'm not concerned about test failures for valid decision boundaries. The minor issue I have in the case of valid decision boundaries is that testing with unseen points to validate a decision boundary is simply redundant. Why it is redundant can be gleaned from your arguments on why and how we can choose correct unseen points: namely they are chosen entirely in reference to seen points.
One issue that concerns me involves invalid decision boundaries. With an invalid decision boundary, there is no guarantee how an unseen point will be classified (just as there is no guarantee how any point will be classified). There are two cases:
- The unseen point is classified incorrectly and there is a test failure. The student is alerted to the fact that their decision boundary is incorrect. This I'm okay with, although it is still redundant.
- The unseen point is classified correctly and the test passes. The student is not alerted to the fact their decision boundary is incorrect and either receives conflicting information from other tests or may indeed end up believing that their decision boundary is correct. This is the first aspect I dislike.
The other issue that concerns me is on a more psychological note: People often take partial information, over-generalize and create false impressions (e.g. stereotypes/prejudices/etc). That's how some of the more naive or inattentive students, seeing testing on unseen points in this way, could walk away with the mistaken impression that Perceptron can correctly classify any unseen point (not just carefully selected points). This is the second aspect I dislike.
So, given that the possible outcomes of testing unseen points are either redundant good info or unnecessary, potentially misleading info, and given the possible misconception about Perceptron that can be conveyed simply by including such testing, I don't feel that testing unseen points in the proposed manner brings enough to the table to be included.
To boil it all down: I feel that 'testing' unseen points to demonstrate how classification works has value, but I think this is too easily confused with testing to verify correctness of the decision boundary. I currently can't think of tests which can eradicate my concerns above. However, there are related ideas we could demonstrate.
For example, we could show that the introduction of a new point to the data will (in all likelihood) result in a different decision boundary. This doesn't involve classification of unseen points though. As an aside: with SVM, we could introduce an element of a demonstration of unseen point classification switching with a before and after approach, but it just doesn't work with Perceptron because of the non-uniqueness of the decision boundary (i.e. we can't be guaranteed a switch in classification).
Do you have any other ideas on ways to demonstrate classification of unseen points?
The cost vs benefit analysis that is going on in my head is the following:
- Testing unseen points to verify correctness is both redundant and carries downsides
- Testing unseen points to demonstrate classification is desirable, but unnecessary to satisfy the aim of the exercise.
The exercise is first and foremost meant to have students learn about the Perceptron algorithm. Including a demonstration of how classification works would be a bonus, but I'm just not sure this type of testing environment properly supports that kind of demonstration.
Now that we know how to tell which side of the hyperplane an object lies on, we can look at how perceptron updates a hyperplane. If an object is on the correct side of the hyperplane, no update is performed on the weights. However, if we find an object on the wrong side, the update rule for the weights is:

$$[w_0', w_1', w_2'] = [w_0 \pm l_{class},\ w_1 \pm x \, l_{class},\ w_2 \pm y \, l_{class}]$$
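For concreteness, a sketch of how that update rule could be applied in one training pass (with a learning rate of 1; the function name is made up and this is not the reference solution):

# points are [x, y] pairs, labels are +1 / -1, w is the weight vector [w0, w1, w2]
function update_pass!(w, points, labels)
    for (p, label) in zip(points, labels)
        # a point on the wrong side (or exactly on the line) triggers an update
        if label * (w[1] + w[2] * p[1] + w[3] * p[2]) <= 0
            w[1] += label
            w[2] += label * p[1]
            w[3] += label * p[2]
        end
    end
    return w
end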
This equation and the next two paragraphs are not clear enough, imo.
The update rule is where the magic happens, so it would be nice to explain this better and try to give the student some insight into how the rule works.
The
I got rid of the plus/minus stuff, since I agree it's confusing/distracting and it really isn't terribly important anyway.
The update rule is where the magic happens, so it would be nice to explain this better and try to give the student some insight into how the rule works.
Were you thinking along the lines of more explanation of how the update makes the line move? Or were you thinking of more clarity on how to implement it? I had left the implementation details a little bit vague intentionally (stopping short of pseudocode), hoping to leave room for the student to be creative or do some research for their implementation. However, I'm aware that some students are at a learning level where this is frustrating. Let me know your thoughts :)
I'll see if I have time and energy to attempt the exercise myself during the week and maybe that will help me clarify my ideas :)
$$[w_0', w_1', w_2'] = [w_0 \pm l_{class}, w_1 \pm xl_{class}, w_2 \pm yl_{class}]$$
We also don't support rendering equations as far as I can recall.
We also don't support rendering equations as far as I can recall.
Fixed.
Better blurb
Delete auto-generated comment as these were manually created
Updates to instructions including clarifications and an added gif from https://commons.wikimedia.org/wiki/File:Perceptron_training_without_bias.gif
Co-authored-by: Victor Goff <[email protected]>
Sorry for radio silence. I've not had the energy to do the exercise yet. Maybe next week. If anyone else fancies giving it a go and posting their comments about it then that's welcome and about the only thing that will speed up the process. Otherwise, no worries, I'll get to it when I can.
No worries! I'm in no hurry and have been learning a fair bit, which leaves me quite content :) I've updated a few files:
After a lot of contemplation on the other suggested tests, I've come to the conclusion that the potential collateral effects are too detrimental in this particular learning environment for me to justify adding them. Of course, discussion on this is still welcome. I'll look at the introduction/instructions split.
Glad to hear you're content, and those changes sound fine :)

For the unseen points, I think your complaint is that we can pick valid unseen points, but we can't ensure that an invalid decision boundary or implementation will fail those tests; whereas if we test only the support vectors, then a decision boundary that passes is by definition valid. Is that your position? I agree that that is true. I don't think it's a problem, though. It's extremely common for an incorrect solution to pass some tests and not others. My suggestion is that we test the support vectors (or all the labeled points, whatever) and also some unseen points.

I can't help much regarding the introduction/instructions split. CLI users probably get either two files or the two files concatenated into a single README.md
The CLI will concatenate them
This is one worry. However, I'm not so much worried that the student will get past all the other tests as about what implicit information this kind of testing gives the student. For example, my biggest concern is that some (inattentive/naive) students, seeing testing of this type, could form the misconception that Perceptron will always classify unseen points correctly. Don't get me wrong, I like to induce misconceptions when teaching in person, but that's because dispelling them often produces a memorable lesson. Here, however, we don't have the luxury of noticing or dispelling misconceptions, which is why I think it's better to mitigate the possibility of introducing them. Another possible misconception that could come from the proposed

In short, I think that the other tests overstep the bounds of what can be reliably accomplished with this type of exercise on this platform, in that they get into territory where we would need to test for understanding.
I've realized that I haven't detailed the logic of the current testing sequence, which is the following progression:
Is there something significant the proposed tests add which warrants all the drawbacks or which couldn't be remedied by further testing of the sort above?
Cleaned up obsolete functionality, and tried to include reporting of students' decision boundaries in testset titles, for easier debugging
Cool, I just don't think this is likely and it's not clear to me why this would be worse than students getting confused about why we only ever test with the same data we provided as input.
These sound good and well thought out :) I would like unseen points to be tested as well, but whatever, let's leave that disagreement until the rest of the exercise is reviewed.
Maybe there is a slight source of confusion between using an output as a proxy to test the correctness of an algorithm versus using an output as a proxy to test the accuracy of an algorithm? The former is the intended scope of this exercise, while I believe testing unseen points does the latter.

A quick analogy to show the difference: we can test the Alphametics algorithm by inputting a puzzle, then taking the output dictionary and testing it against unseen puzzles which have the same encoding as the original puzzle. This test is redundant in the same way as the proposed testing with unseen points*.

I believe the two testing paradigms aren't usually conflated on Exercism because most (if not all) of the current exercises are not ML related, and testing for accuracy is the paradigm most associated with ML. While Perceptron is used in ML, I don't want to mix the two paradigms here, because correctness should be shown before testing accuracy, and trying to do the two together introduces a level of complexity which I don't think is beneficial at this stage. We could always make another exercise which tests Perceptron accuracy, but it would likely involve (or be entirely centered around) hyperparameter tuning, like an actual use case in ML.

To briefly highlight the key differences in the testing paradigms:
I'll be happy to respond to your other comments after the rest of the exercise is reviewed, as you suggested, but I just wanted to leave this for your consideration because I feel it nicely encapsulates the underlying source of many of my misgivings.

*Most of the italicized words have fairly obvious analogues in the Perceptron case, with the possible exception of encoding, whose analogue is support vectors.
Reduced visibility/readability of code containing possible spoilers for students.
Deleted redundant entry for binary-search-tree
I had made a new practice exercise from the library (binary-search-tree) and thought I would be able to open a new PR with it, but it got included in this one. I'm not sure if I've done something wrong or if this is expected. Let me know if there's anything I should do about it, but I'd like to explain the new exercise, since the tests from

Also, it looks like there are checks that are not successful due to three tests in Circular Buffer failing in the
@@ -0,0 +1,2 @@
# Create a (Mutable) Struct BinarySearchTree, which has fields: data, left, right
# Also write a sort method, which returns a sorted array of the elements in a BinarySearchTree
# Also write a sort method, which returns a sorted array of the elements in a BinarySearchTree
# Also write a sort method, which returns a sorted array of the elements in a BinarySearchTree.
It seems that this should be sentences in a comment, but surely the EOL is important as well for the last line of this stub.
Sorry, I feel like I've made a mess of things here. Closing
Another option: if you have a commit that you can go back to, one you can identify as the point at which things were where they should be, you can reset hard and force push to your own branch. That can bring you back to where you have less of that feeling, while keeping the work that you feel has some potential. It might be better than opening a new PR, and it keeps the communication in one place, so those that are following along will not have to re-subscribe to the work they may be interested in. Of course, it is up to you how you want to manage it. I appreciate all the work you have done on this; it is not going unnoticed.
I still think this is good, I've just been too busy to properly review (and now on holiday!!). Please don't delete the branch: I would like the option of working on this even if you decide you don't want to spend any more time on it :)
@kotp Thanks for the suggestion and comments! I'll consider how best to move forward. I just felt the PR was becoming unwieldy because of the new exercise (which may also be able to use a good bit of feedback). I'll keep the branch, as @cmcaine suggests, so anyone can use whatever I've done in the meantime. I don't know if there is anything more I could add to the Perceptron exercise, so feel free to take it and modify it as you please. Likewise, the Binary Search Tree exercise is also available if anyone wants it. I feel it is rather 'bare-bones' since I merely followed what was in
Just an update: I've opened an issue with a list of the exercises I have finished making and a link to the branch in order to make more people aware of the available code. Hopefully this is an acceptable solution.
This is a first attempt at contributing an exercise, so there may be some deficiencies. I haven't received a reply to the issue I opened, so I will leave this as a PR in case it's of interest to Exercism.