Could new() or init(_ count:Int) be done faster? #11

Open
kechan opened this issue Jan 22, 2019 · 2 comments

kechan commented Jan 22, 2019

While investigating method chaining (e.g. sub_) in my other issue, I realized that a significant amount of time is spent instantiating zero-filled Arrays, and very often these are intermediate results that never see the light of day, e.g.:

let v123 = v1 - v2 - v3 (i.e. the result of v1 - v2)

This takes much longer than using sub_(), which doesn't create a new array.

I did a quick benchmark (release):

    benchmark(title: "array.create", num_trials: 10) {
        let a = [Float](repeating: 0, count: len)
    }
    
    benchmark(title: "array.create faster", num_trials: 10) {
        let p = UnsafeMutablePointer<Float>.allocate(capacity: len)
    }

Results:
array.create: 9.340 ms
array.create faster: 0.005 ms

That's a huge speed difference! I strongly suspect one could get away with just the pointer for intermediate results, and wrap it in an Array again only when it is returned "outside" of BaseMath.

E.g. if you use the pure Accelerate API, where a destination pointer is specified, that pointer only needs .allocate; zeroing isn't necessary.

Just a thought on a possible optimization. This library is looking very good. As an experiment, I was able to create my own library that depends on BaseMath but uses the Accelerate API in place of explicit pointer loops, and I really like the concise, clean syntax that looks like ordinary Swift code. The only thing that bothers me is the init that fills the array with zeros, which is not necessary for Accelerate (and probably elsewhere).
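To make the idea concrete, here is a minimal sketch of computing v1 - v2 - v3 with an unzeroed scratch pointer for the intermediate and a result Array whose storage is never zero-filled. It is not BaseMath code: `chainedSubtract` is a hypothetical name, the plain loops stand in for whatever Accelerate call would write into the destination, and it assumes Swift 5.1 or later for `Array(unsafeUninitializedCapacity:initializingWith:)`.

```swift
/// Computes a - b - c elementwise. Hypothetical helper, not part of BaseMath.
func chainedSubtract(_ a: [Float], _ b: [Float], _ c: [Float]) -> [Float] {
    precondition(a.count == b.count && b.count == c.count, "length mismatch")
    let n = a.count

    // Scratch buffer for the intermediate (a - b): allocated, never zeroed.
    let scratch = UnsafeMutablePointer<Float>.allocate(capacity: n)
    defer { scratch.deallocate() }  // Float is a trivial type, so no deinitialize step
    for i in 0..<n { scratch[i] = a[i] - b[i] }

    // The result array's storage also starts uninitialized; we write every element once.
    return [Float](unsafeUninitializedCapacity: n) { buffer, initializedCount in
        for i in 0..<n { buffer[i] = scratch[i] - c[i] }
        initializedCount = n  // tell Array how many elements are now valid
    }
}
```

Only the final result becomes a Swift Array, so the intermediate never pays for a zero fill or for Array bookkeeping.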

kechan commented Jan 22, 2019

I noticed AlignedStorage may not have this problem. Here:

    benchmark(title: "AlignedStorage.create") {
        let v1 = AlignedStorage<Float>(len)
    }

AlignedStorage.create: 0.007 ms

This is really nice. I am still able to do arithmetic and math with this object (needs more testing), as well as route it through Accelerate in my own lib.

However, my previous point may still be valid if one has to use a Swift Array or is simply not aware of AlignedStorage.

Note: for Core ML (Apple's machine learning framework), there's a type called MLMultiArray, which is a multi-dimensional array, but I think it is laid out row-major underneath. You can obtain an UnsafeMutableBufferPointer and then instantiate an AlignedStorage with it, and off you go running nontrivial math algorithms (with Accelerate, if that's profiled to be faster).

Thanks a lot. I have wanted something like BaseMath for a while. People have blogged about the idea, but I wasn't aware of any concrete/complete project until this one.

kechan commented Jan 24, 2019

Update: someone told me about these. I will just dump them here for future investigation:

    var a = [Float]()
    a.reserveCapacity(len)    // NB: a.count will still be 0.

    let b = ContiguousArray<Float>(repeating: 0, count: len)    // this isn't much faster.

    var c = ContiguousArray<Float>()
    c.reserveCapacity(len)    // NB: c.count will still be 0.

The count of 0 worries me: does it mean it isn't doing any eager allocation, or what? I don't know enough Swift here.
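For what it's worth, a small sketch of what `reserveCapacity` does, as far as I understand it: the storage is allocated up front (`capacity` reflects that), while `count` only tracks elements that have actually been appended. `reserveDemo` is a hypothetical helper written just for this illustration.

```swift
/// Reserves space, records count/capacity, then fills the array by appending.
/// Hypothetical helper for illustration only.
func reserveDemo(len: Int) -> (countAfterReserve: Int, capacityAfterReserve: Int, filled: [Float]) {
    var a = [Float]()
    a.reserveCapacity(len)

    // Storage is allocated at this point, but no elements exist yet.
    let countAfterReserve = a.count        // 0
    let capacityAfterReserve = a.capacity  // at least len

    // Elements must be appended; subscripting a[i] past count would trap.
    for i in 0..<len { a.append(Float(i)) }
    return (countAfterReserve, capacityAfterReserve, a)
}
```

So the reservation avoids the zero fill and repeated reallocation, but the array cannot be handed to code that expects `len` valid elements until they have been appended.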
