Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

if map contains key then error else add key/value to map #15

Closed
cal101 opened this issue Mar 29, 2017 · 9 comments
Closed

if map contains key then error else add key/value to map #15

cal101 opened this issue Mar 29, 2017 · 9 comments

Comments

@cal101
Copy link

cal101 commented Mar 29, 2017

Moin,

while implementing a primary index with ImMap i fell fack into mutable habits:

if(contains) then error else add

in my case the bias is on keys being unique and so i can just add the new key and see if the map
instance changes.

That assumes that the following holds true:

map.assoc(existingKey, v) == map

If that holds true i wonder if assocIfAbsent is useful to add.
May take a value or a supplier.

Opinions?

Cal

@GlenKPeterson
Copy link
Owner

One thought is that entry() returns a Paguro Option which supports pattern matching (not referring to regular expressions), so you can:

map = map.entry(someKey)
         .patMat( kvPair -> map,
                  () -> map.assoc(someKey, v) )

You are using == (I assume in Java, because it means something different in Kotlin) which is referential "equality" with Objects (and value equality with primitives). You're really asking, "Do these two objects share the same address in memory?" as a test for whether they are equal or not. This is the default implementation of Object.equals(). Actually, assoc() uses this same definition of equality:
https://github.com/GlenKPeterson/Paguro/blob/master/src/main/java/org/organicdesign/fp/collections/PersistentHashMap.java#L261

This will work for Strings that are compiled into the program (because the compiler de-duplicates Strings), enums, and primitives (which are actually compared by value because they have no address). But you can run into problems with any other kind of object. For instance, new Integer(543) == new Integer(543) is false (they are at different locations in memory) even though new Integer(543).equals(new Integer(543)) is true.

Rich Hickey used referential equality in PersistentHashMap as a speed and memory optimization. It works for that, but it's not a substitute for a true equality test based on the important values in an object.

I'm guessing a bit here. If I'm not answering your question, you might try writing a function (even a poor one) that does what you want. Submit that in your response. Just so that I can see the context of what you're doing. I don't mean to extend PersistentHashMap, just write a function that takes a map and returns either a Map (if you want to ignore duplicates) or an Or<Map,Exception> (if you want to do functional error handling).

@cal101
Copy link
Author

cal101 commented Apr 2, 2017

I think something like

Map<K,V> put(K key, U value, BiFunction<? super V,? super U,? extends V> merge)

as defined in javaslang Map interface

http://www.javadoc.io/doc/io.javaslang/javaslang/2.1.0-alpha

or

public TreeMap<K,V> update(K k, F<V,V> f, V v)

From

http://www.functionaljava.org/javadoc/4.7/functionaljava/index.html

would allow what I meant.

@GlenKPeterson
Copy link
Owner

GlenKPeterson commented Apr 2, 2017

Thank you for taking the time to explain this. I'm including direct links to your specific examples just so I can reference them very quickly in the future.

JavaSlang:
http://static.javadoc.io/io.javaslang/javaslang/2.1.0-alpha/javaslang/collection/Map.html#put-K-U-java.util.function.BiFunction-

FunctionalJava:
http://www.functionaljava.org/javadoc/4.7/functionaljava/fj/data/TreeMap.html#update-K-fj.F-V-

Let me see if I understand this correctly. Both methods return a new Map. If the old map contains the key, a function is executed and the result stored as the new value in the map (the result could be the old value unchanged). This function takes the old value associated with the key (and maybe the new value as well). Is that correct?

How are you using this feature? You mentioned assocIfAbsent. If I were to implement that in JavaSlang, it would look like this:

map = map.put(key, newVal, (oldVal, newVal) -> oldVal);

In FunctionalJava:

map = map.update(key, newVal, (oldVal) -> oldVal);

In Paguro:

// get a Paguro Option of the k/v pair associated with this key.
map = map.entry(key)
         // if isSome() return the map unchanged.
         .patMat( oldKV -> map,
                  // if isNone(), assoc the new value.
                  () -> map.assoc(key, newVal) )

ML, Scala, and Haskell allow matching based on types. It's a little bit like destructuring in a way. Paguro's Option is intended to be similar to that. OneOf2 works the same way and it will probably be joined by OneOf3, OneOf4, ... in the future. Those might be renamed Union2, Union3, etc. because they can function like Union types for Java. But I digress...

Your other example was if(contains) then error else add. Maybe that's what had me thinking about Option. If by error you mean Exception:

JavaSlang:

map = map.put(key, newVal, (oldVal, newVal) -> throw new RuntimeException("contains"));

FunctionalJava:

map = map.update(key, newVal, (oldVal) -> throw new RuntimeException("contains"));

Paguro:

map = map.entry(key)
         .patMat( oldKV -> throw new RuntimeException("contains"),
                  () -> map.assoc(key, newVal) )

So, FunctionalJava is the briefest and Paguro is the most verbose for this use case. I haven't personally seen a lot of use for assocIfAbsent(key, value). I usually am fine with letting the map take the last value that was put in it. So it doesn't seem a strong candidate for a convenience method. But perhaps the way you use maps will convince me otherwise?

Hmm... You know Paguro's ImList has a reverse()? If you reverse your input, then let the map take the last value, the result is the same as what I think you're asking for.

ImList<User> users = ...; // however you get this data

ImMap<Long,User> usersById =
        users.reverse()
             .toImMap(u -> tup(u.id, u));

By reversing the input, the earliest values effectively overwrite the later ones.

I still don't feel like I completely understand your use model. I hope this is helpful.

@cal101
Copy link
Author

cal101 commented Apr 3, 2017

I am using it like a database primary index. First key wins, exception on duplicates.
Your RuntimeException example fits it.
Without such a method you typically have to do 2 tree search walks:

if (!map.containsKey(key)) // first walk
map = map.assoc(key, value) // second walk with modification

For HashMap that was fixed with putIfAbsent.

The difference between your patMap example is the extra option object and the for me strange
means of using something that looks like an option return value to add to the map i got it from.
The paguro map misses something like putIfAbsent if you call it patMap or whatever.
Patmap as a new method on the map is even better than the other examples because the computation of the value can be lazy.

Result: I want ImMap.patMap ;-)

@GlenKPeterson
Copy link
Owner

Paguro generally accommodates what you are asking for with patMat ("pattern matching"). I understand that you don't like creating a tuple, or the syntax of accessing the map inside the match. You are asking for a convenience method for your specific use case which I think is somewhat rare. Unless/until more people ask for it, I can't see putting it ahead of the other development priorities listed here:
https://github.com/GlenKPeterson/Paguro#future-development-priorities-as-of-2016-11-13

So, "maybe someday" is the current resolution of your issue. I'm just going to rename this issue and leave it open for a while to see if more people vote for it.

@GlenKPeterson GlenKPeterson changed the title assocIfAbsent useful? Apr 20, 2017
@GlenKPeterson GlenKPeterson changed the title Add assocIfAbsent(K key, Function1<V,V> applyIfKeyExists, V newValue) to ImMap<K,V> Apr 20, 2017
@GlenKPeterson
Copy link
Owner

GlenKPeterson commented May 16, 2017

I think I have new insight into what you might be doing. You must have some source of data to add to this map. Let's assume it's from a java.util.ArrayList, called someList. If you didn't care about duplicates, transforming to a map might look like this:

ImMap<String,Integer> foo(ArrayList<String> someList,
                          Fn1<String,Integer> f) {

    return xform(someList).toImMap(x -> tup(x, f.apply(x)));
}

Maybe you would add .filter(), .map(), etc. before .toImMap(), but I'm leaving that out of this example for simplicity. The problem for you is it doesn't stop when a duplicate key is encountered. You can do that with an exception and a fold today (sorry I didn't think of this before):

xform(someList)
        .fold(map(),
              (accum, s) -> accum.entry(s)
                                 .match(exists -> {throw new Exception("Duplicate");},
                                        () -> accum.assoc(s, f.apply(s))));

map() creates the new empty immutable map to be our accumulator. fold then loops, passing the map and the next item from the source data to the reducer function. In the reducer, it checks if the map already contains that key and if so, throws an exception. If not, it returns the accumulated map with the new key added. When all items are processed, fold yields the accumulated map.

Yay! But many functional programmers don't like exceptions. Especially to cause loop termination.

If you want to return an Or<GOOD,BAD> instead of throwing an exception, there is another fold method that terminates early, but doesn't have quite the signature that you need. If you like this, I could probably change the signature of that method in Paguro 3.0 which will have some other small breaking changes. Here's the code that you'd write. Notice the outer function now returns an Or instead of an ImMap directly:

Or<ImMap<String,Integer>,String> baz(ArrayList<String> someList,
                                     Fn1<String,Integer> f) {
    return xform(someList)
            .foldUntil(map(),
                       (accum, s) -> accum.containsKey(s) ? null
                                                          : "Duplicate key",
                       (accum, s) -> accum.assoc(s, f.apply(s)));
}

Ahh... purely functional error handling.

The first (accum, s) -> function is a terminate-early function that actually gets run for each item in the source data immediately before the reducer function. The second (accum, s) -> function is the standard reducer function. As long as the terminate-early function returns null, the loop continues and will eventually yield an Or.Good of the appropriate type. If the terminate-early function ever returns a non-null, the loop stops and the fold immediately returns an Or.Bad consisting of whatever the terminate-early function returned. Here's the signature for the proposed fold-with-terminate-early function:

<G,B> Or<G,B> foldUntil(G ident,
                        Fn2<? super G,? super T,B> terminator,
                        Fn2<? super G,? super T,G> reducer);

The G means Good, B means Bad, T is the type of the input items from the collection you're processing. This will work and be fast if it solves your issue.

If you actually had to throw the exception later for some reason, you could:

ImMap<String,Integer> doTheBaz(ArrayList<String> someList) throws Exception {
    return baz(someList, String::length)
            .match(good -> good,
                   bad -> {throw new Exception(bad);});
}

I need to think about all the implications, but I like it and already coded it into Paguro 3.0. I think it will stay there, so please let me know if it's not right.

Questions

Q: Why are you returning null? I thought you didn't like null?
A: Wrapping everything in Or.good() and unwrapping it on every iteration is slow and uses memory. The early-terminating fold-left is used internally in some performance critical places. It also complicates the method signature, the implementation, and usage by end users. Finally, null is only really evil when it's returned instead of an empty collection.

Q: What else is in 3.0? When will it be out?
A: An RRB Tree. Everything deprecated in 2.x will be removed to keep jar size small. Possibly a new layer of interfaces. All Function1, Function2, etc. interfaces are renamed to Fn1, Fn2, etc. foldLeft() renamed to fold(). It will be released when it's ready, but a release candidate might be available by end of June?

Thanks for your interest. I hope you can continue to enjoy using Paguro!

@GlenKPeterson GlenKPeterson changed the title Add ImMap.assocIfAbsent(K key, Function1<V,V> howToUpdateValue, V newValIfAbsent) May 16, 2017
@GlenKPeterson
Copy link
Owner

I think you have given up, which is OK. But for the record, this is now a part of Paguro 3.0 here:
https://github.com/GlenKPeterson/Paguro/blob/2016-05-22_RRB-Tree/src/main/java/org/organicdesign/fp/xform/Transformable.java#L106
I'll close this ticket when that's released. If you need it sooner, I can back-port it into 2.x.

@cal101
Copy link
Author

cal101 commented May 22, 2017

No need to backport.
Has nothing to do with my requirement or question as far as I see.
No need to discuss that any further.

@GlenKPeterson
Copy link
Owner

I fixed your issue as I understood it, but maybe not as you understood it. Not sure whether that means this should be closed as "fixed" or "wont fix" but it sounds like time to close it either way. I feel you have helped the development of Paguro and I thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
2 participants