-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazy/non-lazy concat() in PersistentHashSet versus PersistentVector #49
Comments
Thanks for your interest in Paguro and for your insightful comments. It looks like I implemented If you're familiar with Clojure, Transformable/Xform is kind of like the Sequence Abstraction. As you figured out, by default, all methods on Xform produce another Xform, until you call one of the terminator methods like I guess it would be purer and less surprising to get rid of that implementation and make you call For your purposes, maybe we should have the Xform remember the source? Have a "render to same form as source collection?" Not sure I'd want that to render to Arrays, if an Xform was started with an In your use case, how are you making use of the Sets and Lists later? Do you check what they are and treat one as a set and the other as a List? Or use them both as Iterables or Transformables, or what? |
Thanks for your prompt response and for shedding some light on why there are different implementations for the two types. I definitely see the different complexity for implementing Still, it can be very surprising to receive an eager, shape-retaining result for one type and a lazy, shape-discarding result for another type, when calling a method of a common interface. Since A common problem with most "fluent iterable" libraries is that the original "shape" or type of the initial data structure is lost on the first transformation, and all subsequent steps only deal with a generic Of course, in plain Java it is notoriously difficult to express the concept that a method returns the same type as the object that it was called on (and will typically require some sort of higher-kinded type library). I'm still too new to Paguro to make any real design recommendations here, but it would be nice to have some consistency (i.e., either all implementations of an interface method are lazy, returning a transformation, or they are all eager, returning a new object of the original type). In the end, this might require supporting two sets of methods, a set of efficient transformer-based methods, and another set of possibly less efficient, shape-retaining methods. I like your suggestion of having the Xform remember the original shape and have some sort of |
Random thoughts: New theory: If an abstraction is going to leak somewhere, you might as well put the leak where you only find it if you're doing something you shouldn't be with it. Like trying to prepend or precat to a PersistentVector. That way, I could implement concat on everything, and precat on everything except PersistentVector. That would let me fix your issue without making anything more surprising for anyone else? This project includes an Equator and ComparisonContext, both of which allow you to use arbitrary concepts of equality and comparison that are not tied to the default implementation in .equals() and .hashCode(). This project also includes (an experimental) RuntimeTypes, which could be used to preserve aspects of the original type for an Xform. Right now it's only used for the toString() implementation of the OneOf... union type classes, but it could be used to preserve type information for any reason. |
Version 3.10.2 is released (may take a few hours to be visible). Let me know if that solves your issue. I implemented .concat() and .precat() on unordered sets and maps (and .precat() on RrbTree). It's sensible and cheap to do that. I didn't implement .precat() or .concat() on ordered sets/maps because that would suggest an ordering other than the one declared by the comparator. So if you call those methods on sorted maps/sets, you're still going to get an Xform. Ditto calling .precat() on ImList (but on RrbTree, it's fine to call that). Hmm... Yeah, now I'm having misgivings.
I think I'm going to back out some of these changes. Sorry! |
Version 3.10.3 is released, fixing my crazy stuff from the previous round. This is how I would solve your issue: Make a wrapper class to abstract the type of the underlying collection. This is overly complicated in several ways, but hopefully it's helpful: import org.jetbrains.annotations.NotNull;
import org.junit.jupiter.api.Test;
import org.organicdesign.fp.collections.ImList;
import org.organicdesign.fp.collections.ImSet;
import org.organicdesign.fp.collections.PersistentHashSet;
import org.organicdesign.fp.collections.PersistentVector;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.organicdesign.fp.FunctionUtils.stringify;
public class OneOf2Test {
static class Set_List<T>
extends OneOf2<ImSet<T>, ImList<T>>
implements Iterable<T>
{
private final Class<T> clazz;
// Constructor
@SuppressWarnings("unchecked")
private Set_List(@NotNull Object o, int n, Class<T> claz) {
// Casting thanks to https://stackoverflow.com/a/1080525/1128668
super(o,
(Class<ImSet<T>>)(Class<?>) ImSet.class,
(Class<ImList<T>>)(Class<?>) ImList.class,
n);
clazz = claz;
}
@Override
public @NotNull String toString() {
return stringify(item) + ":" +
"ImSet<" + clazz.getSimpleName() + ">|" +
"ImList<" + clazz.getSimpleName() + ">";
}
public @NotNull Set_List<T> add(T item) {
return match(set -> new Set_List<>(set.put(item), 0, clazz),
list -> new Set_List<>(list.append(item), 1, clazz));
}
@Override
public @NotNull Iterator<T> iterator() {
return match(set -> set,
list -> list).iterator();
}
// Alternate implementation:
// @Override
// @SuppressWarnings("unchecked")
// public @NotNull Iterator<T> iterator() {
// return ((Iterable<T>) item).iterator();
// }
// Factory method
public static <U> @NotNull Set_List<U> createSet(Class<U> clazz) {
return new Set_List<>(PersistentHashSet.empty(), 0, clazz);
}
// Factory method
public static <U> @NotNull Set_List<U> createList(Class<U> clazz) {
return new Set_List<>(PersistentVector.empty(), 1, clazz);
}
}
@Test
public void testStuff() {
Set_List<Integer> operations1 = Set_List.createSet(Integer.class);
Set_List<Integer> operations2 = Set_List.createList(Integer.class);
operations1 = operations1.add(3);
operations1 = operations1.add(5);
operations1 = operations1.add(5);
operations2 = operations2.add(3);
operations2 = operations2.add(5);
operations2 = operations2.add(5);
assertEquals("PersistentHashSet(3,5):ImSet<Integer>|ImList<Integer>", operations1.toString());
assertEquals("PersistentVector(3,5,5):ImSet<Integer>|ImList<Integer>", operations2.toString());
}
} You can make your wrapper implement whatever common interface is most helpful to you. Here I implemented Iterator. There's a lot of types in this example. I'm sure you can do something much simpler if you don't need .equals() and .hashCode() and can hardcode some types. I never even tried to make a union type with generics like this before. |
Thanks for the quick fix, Glen! Yes, this looks like it will solve my issue (haven't fully coded everything yet, but I'll let know if I come across anything that doesn't work as expected). Thank you so much! |
I'm using Paguro in a project that deals with commutative and non-commutative operations, and I use
PersistentHashSet
for commutative ones and aPersistentVector
for non-commutative ones, to distinguish between ordered and unordered elements. Elements are added to these data structures via theconcat(Iterable)
method. I noticed that for aPersistentVector
this results (as expected) in a newPersistentVector
object, but thePersistentHashSet
I receive an intermediate transformation operator (Xform$AppendIterDesc
) that (lazily, I assume) represents the concatenation. Notably, the result is not aPersistentHashSet
.Is this intentionally designed like this? If so, it might be helpful to include the rationale for this divergent behavior somewhere in the documentation. I'm using
concat
because it seems to be the only API method that allows me to append to either data structure without knowing what the underlying implementation is. Is there some other method that would allow for that while returning an object of the same type as the original data structure?The text was updated successfully, but these errors were encountered: