Skip to content

All combinations #51

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Jan 14, 2021
Merged

All combinations #51

merged 21 commits into from
Jan 14, 2021

Conversation

mdznr
Copy link
Contributor

@mdznr mdznr commented Dec 18, 2020

Description

This addition adds the ability to easily get combinations of a range of sizes, instead of the previous behavior of a single, fixed size, k.

Detailed Design

The combinations(ofCount:) function that takes a RangeExpression is added in an extension of Collection.

extension Collection {
  public func combinations<R: RangeExpression>(
    ofCount kRange: R
  ) -> Combinations<Self> where R.Bound == Int {
    return Combinations(self, kRange: kRange)
  }
}

If the upper bound of the range exceeds the length of the collection, no such combinations of those sizes will be returned, effectively clamping the range to 0...n where n is the length of the collection. The behavior of negative numbers also remains unchanged (asserting).

This also updates the implementation of the existing combinations(ofCount:) function to initialize a Combinations with a range of k...k to provide the same functionality as it did before.

public struct Combinations<Base: Collection> {
  internal init(_ base: Base, k: Int) {
    self.init(base, kRange: k...k)
  }
 }

Complexity

The complexity remains O(1) for initializing Combinations. Accessing the next combination also remains O(1).

Naming

The name combinations(ofCount:) is the same as the original, but reflects the ability to provide multiple counts. Alternatively considered combinations(ofCounts:) and combinations(ofCountsIn range:).

Documentation Plan

The documentation in Guides/Combinations.md has been updated to reflect the new function. Inline code documentation also exists for each of the functions.

Test Plan

  • Adds tests for the count property, which was previously untested
  • Adds tests for the outputs of a range of sizes
  • Adds tests for all valid sizes
  • Adds tests for ranges extending above the size of the collection

Source Impact

The functionality is purely additive. The behavior of the existing function is unchanged.

Checklist

  • I've added at least one test that validates that my change is working, if appropriate
  • I've followed the code style of the rest of the project
  • I've read the Contribution Guidelines
  • I've updated the documentation if necessary

@mdznr mdznr marked this pull request as ready for review December 18, 2020 18:16
@mdznr mdznr requested a review from natecook1000 December 18, 2020 18:18
Copy link
Member

@natecook1000 natecook1000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this addition, @mdznr! 👏👏

I'm a big fan of the range-based version, especially that you can pass all kinds of range expressions and it just works. The double plural in the name strikes me as a little odd, but I don't think it's worth changing unless there's something clearly better.

I'm not sold on the no-argument version. Can you explain a bit about your use case for including this? As a user, I would be surprised that the result contained single-element "combinations" and very surprised about the leading zero-element value. I'm inclined to just document the usage in the range-based version, since you can use a range expression to explicitly get the behavior you want — e.g. numbers.combinations(ofCounts: 0...).

: 0
return k.map {
binomial(n: n, k: $0)
}.reduce(0, +)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future optimization — the binomial coefficient is symmetric around N/2 or (N - 1)/2, so up to half the work could be skipped here. Or there might be a way to calculate this directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I’ll play around with a way to do this that maintains readability. We can probably get most of the algorithmic performance benefits by using memoization in binomial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested adding a small memoization cache (Dictionary) to binomial. To get the count value for Combinations of a Collection of length 26 ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"), using count 1..<26 it cut down the number of calls to binomial from 206 to 128.

I have some ideas on how to further improve performance for this. There is already a special case when the range of k is 0...base.count to use 1 << n (2^n). However, a potential common case is excluding the empty and/or complete set (1..<base.count), which we can easily calculate using 2^n minus 1 (or 2). We can do some work to get the absolute minimum number of computations necessary to calculate a contiguous range of values in the last row of the arithmetic triangle.

Should I move that to a separate, subsequent PR so not to block this one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be done later for sure!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#58

@mdznr mdznr force-pushed the all_combinations branch 3 times, most recently from e36b638 to 85f126f Compare January 5, 2021 20:59
@mdznr
Copy link
Contributor Author

mdznr commented Jan 6, 2021

Thanks for the feedback, @natecook1000 !

The double plural in the name strikes me as a little odd, but I don't think it's worth changing unless there's something clearly better.

I agree that it does sound a bit odd. I’d definitely be up for changing it if anyone has any suggestions. Perhaps something like combinations(ofCountIn kRange: R)?

I'm not sold on the no-argument version. Can you explain a bit about your use case for including this? As a user, I would be surprised that the result contained single-element "combinations" and very surprised about the leading zero-element value. I'm inclined to just document the usage in the range-based version, since you can use a range expression to explicitly get the behavior you want — e.g. numbers.combinations(ofCounts: 0...).

I do see why it seems bit weird at first, but there’s a couple reasons why I have it as such:

  1. It’s consistent with well-known mathematical topics.[1][2]
  2. It works well with two related additions that I may put up in future subsequent PRs:
    i. Bifurcate. This addition would be a function like the standard library’s filter(_:) but returns a tuple of two arrays with the included elements in the first, and the excluded in the second. bifurcate(_:) -> ([E], [E])
    ii. BifurcatedCombinations. This builds off of Combinations and Bifurcate to allow you to easily iterate over all combinations of a collection. If you were building something like a matching-type card game where you draw cards and have to create melds, it is useful to go through all possible combinations, including the empty and complete sets. For example, if you have [7♦, 8♦, 9♦] and you drew [5♦, 5♣] (able to discard/keep any number of them), it would be good to know that you have four options:
    • Keep []; discard [5♦, 5♣]; (to maintain a hand complete with melds)
    • Keep [5♦]; discard [5♣]; (in hopes of drawing 6♦ to create a run of 5-9♦ later)
    • Keep [5♣]; discard [5♦]; (probably not a great strategy in this game)
    • Keep [5♦, 5♣]; discard []; (in hopes of creating a set of 3 or more 5s later)

@mdznr mdznr requested a review from natecook1000 January 6, 2021 18:49
@kylemacomber
Copy link

The double plural in the name strikes me as a little odd, but I don't think it's worth changing unless there's something clearly better.

I agree that it does sound a bit odd. I’d definitely be up for changing it if anyone has any suggestions. Perhaps something like combinations(ofCountIn kRange: R)?

We might be able to get away with just overloading combinations(ofCount:):

  • combinations(ofCount: 3..<6) — "combinations of count 3 up to 6"
  • combinations(ofCount: 3...5) — "combinations of count 3 through 5"
  • combinations(ofCount: 3...) — "combinations of count greater than or equal to 3"
  • etc.

I'm inclined to just document the usage in the range-based version, since you can use a range expression to explicitly get the behavior you want — e.g. numbers.combinations(ofCounts: 0...).

I do see why it seems bit weird at first, but there’s a couple reasons why I have it as such:

  1. It’s consistent with well-known mathematical topics.[1][2]
  2. It works well with two related additions that I may put up in future subsequent PRs:

I think these are some strong points, however I'm not sure they overcomes (for me at least) just how trivial and explicit it is to compose combinations(ofCounts: 0...) on your own. It seems to me that it might be better to wait and see if combinations(ofCounts: 0...) becomes relatively common and idiomatic, and then discuss adding a convenience.

@mdznr
Copy link
Contributor Author

mdznr commented Jan 7, 2021

The double plural in the name strikes me as a little odd, but I don't think it's worth changing unless there's something clearly better.

I agree that it does sound a bit odd. I’d definitely be up for changing it if anyone has any suggestions. Perhaps something like combinations(ofCountIn kRange: R)?

We might be able to get away with just overloading combinations(ofCount:):

I like that idea. However, while having two functions named combinations(ofCount:) compiles, lines of code that try to call it doesn’t compile, unfortunately.

Screen Shot 2021-01-06 at 8 15 37 PM

I'm inclined to just document the usage in the range-based version, since you can use a range expression to explicitly get the behavior you want — e.g. numbers.combinations(ofCounts: 0...).

I do see why it seems bit weird at first, but there’s a couple reasons why I have it as such:

  1. It’s consistent with well-known mathematical topics.[1][2]
  2. It works well with two related additions that I may put up in future subsequent PRs:

I think these are some strong points, however I'm not sure they overcomes (for me at least) just how trivial and explicit it is to compose combinations(ofCounts: 0...) on your own. It seems to me that it might be better to wait and see if combinations(ofCounts: 0...) becomes relatively common and idiomatic, and then discuss adding a convenience.

Sounds good 👍. My original use case was to iterate through all combinations, so that’s where I started. Support for arbitrary ranges was added along the way since it was easy to do.

@mdznr mdznr force-pushed the all_combinations branch from 263c4f4 to a3ec470 Compare January 7, 2021 04:24
@natecook1000
Copy link
Member

We might be able to get away with just overloading combinations(ofCount:):

I like that idea. However, while having two functions named combinations(ofCount:) compiles, lines of code that try to call it doesn’t compile, unfortunately.

That's a great idea, Kyle. @mdznr can you double check your error here? I'm not seeing an ambiguity with that change — maybe you had a stale Algorithms module when compiling.

@natecook1000
Copy link
Member

Also, I've definitely seen requests for bifurcate before, but haven't heard of bifurcatedCombinations. Interesting!

@mdznr
Copy link
Contributor Author

mdznr commented Jan 11, 2021

We might be able to get away with just overloading combinations(ofCount:):

I like that idea. However, while having two functions named combinations(ofCount:) compiles, lines of code that try to call it doesn’t compile, unfortunately.

That's a great idea, Kyle. @mdznr can you double check your error here? I'm not seeing an ambiguity with that change — maybe you had a stale Algorithms module when compiling.

I was surprised to see that it wasn’t working, too. After relaunching Xcode this morning, it seems to work. I’ll go ahead and make that change now.

@mdznr mdznr requested a review from kylemacomber January 11, 2021 18:06
Functions that accept a single integer value still name the parameter as `k`.
@kylemacomber
Copy link

Does it make sense to add the support for ranges to permutations(ofCount:) as well? I'm not sure it needs to be implemented as a part of this PR, if it's a good idea, we could also just open an issue to track it.

@mdznr
Copy link
Contributor Author

mdznr commented Jan 11, 2021

Does it make sense to add the support for ranges to permutations(ofCount:) as well? I'm not sure it needs to be implemented as a part of this PR, if it's a good idea, we could also just open an issue to track it.

I think so. It’ll definitely produce quite a lot of permutations to iterate through, but should be done for parity. If the high-level changes for this PR are considered approved, I’ll begin work on making the changes there, too.

Copy link

@kylemacomber kylemacomber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go over the code and comments with a fine toothed comb, Nate's more qualified to do that, but at a more superficial level this all looks fantastic! Wonderful work @mdznr!

Copy link
Member

@natecook1000 natecook1000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking close, @mdznr! 👏

@mdznr
Copy link
Contributor Author

mdznr commented Jan 11, 2021

Thanks for the feedback, @kylemacomber and @natecook1000 . I think I’ve addressed all the feedback so far. Once this is approved/merged, I’ll create the issues (enhancements) to optimize the performance of count and add parity to Permutations.

@mdznr mdznr mentioned this pull request Jan 11, 2021
4 tasks
Copy link
Member

@natecook1000 natecook1000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: 🎉

@natecook1000 natecook1000 merged commit 5b7993c into apple:main Jan 14, 2021
@mdznr mdznr deleted the all_combinations branch January 14, 2021 20:41
kylemacomber pushed a commit that referenced this pull request Feb 12, 2021
Adds the ability to easily get permutations of a range of sizes, like #51 did for Combinations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants