Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normative: specify web reality for Array.prototype.join with cyclic values #1518

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ljharb
Copy link
Member

@ljharb ljharb commented Apr 26, 2019

Fixes #289.

This is an attempt, based on my reading of #289, to specify web reality for Array.prototype.join when the array is cyclic.

I'm using a slot on the current realm to keep a "seen" List, which I hope will avoid the reentrancy issues discussed here.

I have no strong opinion about this PR implementation, so if anyone has any better suggestions I am more than happy to accommodate them! Reviews are welcome.

@ljharb ljharb added normative change Affects behavior required to correctly evaluate some ECMAScript source text web reality labels Apr 26, 2019
@devsnek
Copy link
Member

devsnek commented Apr 26, 2019

can this be annex b?

also can [[Seen]] be an internal slot on %Array.prototype.join% or renamed to like [[ArrayJoinSeen]] (preferably the first one)

@ljharb
Copy link
Member Author

ljharb commented Apr 26, 2019

Sure, i like the idea of storing it on the function instead of the realm. I’ll update that.

As for annex B, i suppose, but generally that’s not just for every web reality change, but instead for the ones that are undesired in the language. What’s the argument for that?

spec.html Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
@ljharb
Copy link
Member Author

ljharb commented Apr 26, 2019

I've updated to store the slot on the function itself; i've used the intrinsic notation in #1376 (if this is ready to land before that one, I'll update to add a new intrinsic for join)

@devsnek
Copy link
Member

devsnek commented Apr 26, 2019

for annex b reasoning, I think it throwing is more reasonable (like JSON.stringify and such) although it was brought up that it's not explicitly set to throw on a circular, but all the implementations would stack overflow if they didn't drop the value to an empty string.

@erights
Copy link

erights commented Apr 26, 2019

Are we sure that this Seen Set cannot be used as a global communications channel?

@littledan
Copy link
Member

I'd prefer that this not be in Annex B. Separation of parts of the specification into Annex B makes it harder to read, and it doesn't make it any more web-compatible to not ship the semantics.

@domenic
Copy link
Member

domenic commented Apr 26, 2019

Strong +1 to working on this; thank you @ljharb for taking up this important work.

Also strong +1 to keeping this in the main spec.

The main question here seem to be how well this matches web reality. @ljharb, do you have a test suite perchance, with a comparison of different engines plus this spec?

@ljharb
Copy link
Member Author

ljharb commented Apr 26, 2019

Not quite yet - I wanted to first get a rough consensus on the needed spec changes, before attempting to write test262 tests (and then gathering their results with eshost).

Quite certainly the only way this PR would be feasible is if it matches web reality; if the tests indicate it doesn't, I'll iterate on both until I've got something that does.

@anba
Copy link
Contributor

anba commented Apr 26, 2019

The proposed change doesn't match web-reality (*) and is missing abrupt completion handling. Was there anything wrong with the patch linked in #289 (comment) that it can't be used? (Apart from the potential issues mentioned in https://gist.github.com/anba/4a154bd7143bf2bab3ef#gistcomment-1671398. 😄 )

And as mentioned in #289 (comment), there are a couple of small differences between all engines when to check for cycles resp. which objects are eligible for the cycle check.

So I'd say the following questions need to be answered first:

  • Is there a single list for Array.prototype.join and Array.prototype.toLocaleString or separate ones? (I think all engines use a single shared list for join and toLocaleString.)
  • When to apply the cycle check (before or after ToString(separator))? (**)
  • Do we want to have a per Realm or per Agent list? (***)

(*) For example it'll print pre,pre,,,post,,post for the following test case, whereas all engines print pre,,,post for that test.

var a = ["pre", null, null, "post"];
a[1] = a;
a[2] = a;
print(a.join());

(**) Prints "ToString(separator)" in V8/Chakra, but nothing in JSC/SpiderMonkey.

Array.prototype.toString = function() {
    return this.join({
        toString() {
            print("ToString(separator)");
            return ",";
        }
    });
};
var a = []; a.push(a); a.join();

(***) "pre1,pre2,pre1,,post1,post2,post1" in Chakra/V8, but "pre1,pre2,,post2,post1" in JSC/SpiderMonkey.

if (typeof newGlobal !== "function") {
    if (typeof createGlobalObject === "function") {
        var newGlobal = createGlobalObject;
    } else if (typeof WScript === "object") {
        var newGlobal = function() {
            return WScript.LoadScript("this", "samethread");
        };
    } else if (typeof Realm === "object") {
        var newGlobal = function() {
            return Realm.global(Realm.createAllowCrossRealmAccess());
        };
    }
}

var g = newGlobal();

var a = [];
var a2 = g.eval("[]");

a.push("pre1", a2, "post1");
a2.push("pre2", a, "post2");

print(g.Array.prototype.join.call(a));
@ljharb
Copy link
Member Author

ljharb commented Apr 26, 2019

@anba i just missed it :-) do you have interest in taking over this PR? You clearly have much more context than i do.

A shared list seems fine to me; I’d assume per realm and not per agent; we could apply the check either before or after, but if the result will be ignored I’d assume before is preferable.

@devsnek
Copy link
Member

devsnek commented Jun 29, 2020

This diff should work I think (lightly tested in engine262)

diff --git a/spec.html b/spec.html
index 507111a..3325d1f 100644
--- a/spec.html
+++ b/spec.html
@@ -33879,22 +33879,36 @@ THH:mm:ss.sss
         <p>The `join` method takes one argument, _separator_, and performs the following steps:</p>
         <emu-alg>
           1. Let _O_ be ? ToObject(*this* value).
-          1. Let _len_ be ? LengthOfArrayLike(_O_).
-          1. If _separator_ is *undefined*, let _sep_ be the single-element String *","*.
-          1. Else, let _sep_ be ? ToString(_separator_).
-          1. Let _R_ be the empty String.
-          1. Let _k_ be 0.
-          1. Repeat, while _k_ &lt; _len_,
-            1. If _k_ &gt; 0, set _R_ to the string-concatenation of _R_ and _sep_.
-            1. Let _element_ be ? Get(_O_, ! ToString(_k_)).
-            1. If _element_ is *undefined* or *null*, let _next_ be the empty String; otherwise, let _next_ be ? ToString(_element_).
-            1. Set _R_ to the string-concatenation of _R_ and _next_.
-            1. Set _k_ to _k_ + 1.
+          1. Let _realm_ be the current Realm Record.
+          1. If _O_ is in _realm_.[[ArrayJoinCycleSet]], then
+            1. Return *""*.
+          1. Append _O_ to _realm_.[[ArrayJoinCycleSet]].
+          1. Let _R_ be ArrayJoin(_O_, _separator_).
+          1. Let _last_ be the last element of _realm_.[[ArrayJoinCycleSet]].
+          1. Assert: _last_ is _O_.
+          1. Remove the last element of _realm_.[[ArrayJoinCycleSet]].
           1. Return _R_.
         </emu-alg>
         <emu-note>
           <p>The `join` function is intentionally generic; it does not require that its *this* value be an Array object. Therefore, it can be transferred to other kinds of objects for use as a method.</p>
         </emu-note>
+
+        <emu-clause id="ArrayJoin" aoid="ArrayJoin">
+          <h1>ArrayJoin ( _O_ , _separator_ )</h1>
+          <emu-alg>
+            1. Let _len_ be ? LengthOfArrayLike(_O_).
+            1. If _separator_ is *undefined*, let _sep_ be the single-element String *","*.
+            1. Else, let _sep_ be ? ToString(_separator_).
+            1. Let _R_ be the empty String.
+            1. Let _k_ be 0.
+            1. Repeat, while _k_ &lt; _len_,
+              1. If _k_ &gt; 0, set _R_ to the string-concatenation of _R_ and _sep_.
+              1. Let _element_ be ? Get(_O_, ! ToString(_k_)).
+              1. If _element_ is *undefined* or *null*, let _next_ be the empty String; otherwise, let _next_ be ? ToString(_element_).
+              1. Set _R_ to the string-concatenation of _R_ and _next_.
+              1. Set _k_ to _k_ + 1.
+            1. Return _R_.
+          </emu-alg>
+        </emu-clause>
       </emu-clause>
 
       <emu-clause id="sec-array.prototype.keys">
@@ -34464,18 +34478,13 @@ THH:mm:ss.sss
         <p>The following steps are taken:</p>
         <emu-alg>
           1. Let _array_ be ? ToObject(*this* value).
-          1. Let _len_ be ? LengthOfArrayLike(_array_).
-          1. Let _separator_ be the String value for the list-separator String appropriate for the host environment's current locale (this is derived in an implementation-defined way).
-          1. Let _R_ be the empty String.
-          1. Let _k_ be 0.
-          1. Repeat, while _k_ &lt; _len_,
-            1. If _k_ &gt; 0, then
-              1. Set _R_ to the string-concatenation of _R_ and _separator_.
-            1. Let _nextElement_ be ? Get(_array_, ! ToString(_k_)).
-            1. If _nextElement_ is not *undefined* or *null*, then
-              1. Let _S_ be ? ToString(? Invoke(_nextElement_, *"toLocaleString"*)).
-              1. Set _R_ to the string-concatenation of _R_ and _S_.
-            1. Set _k_ to _k_ + 1.
+          1. Let _realm_ be the current Realm Record.
+          1. If _array_ is in _realm_.[[ArrayJoinCycleSet]], then
+            1. Return *""*.
+          1. Append _array_ to _realm_.[[ArrayJoinCycleSet]].
+          1. Let _R_ be ArrayToLocaleString(_array_).
+          1. Let _last_ be the last element of _realm_.[[ArrayJoinCycleSet]].
+          1. Assert: _last_ is _array_.
+          1. Remove the last element of _realm_.[[ArrayJoinCycleSet]].
           1. Return _R_.
         </emu-alg>
         <emu-note>
@@ -34484,6 +34493,25 @@ THH:mm:ss.sss
         <emu-note>
           <p>The `toLocaleString` function is intentionally generic; it does not require that its *this* value be an Array object. Therefore it can be transferred to other kinds of objects for use as a method.</p>
         </emu-note>
+
+        <emu-clause id="ArrayToLocaleString" aoid="ArrayToLocaleString">
+          <h1>ArrayToLocaleString ( _array_ )</h1>
+          <emu-alg>
+            1. Let _len_ be ? LengthOfArrayLike(_array_).
+            1. Let _separator_ be the String value for the list-separator String appropriate for the host environment's current locale (this is derived in an implementation-defined way).
+            1. Let _R_ be the empty String.
+            1. Let _k_ be 0.
+            1. Repeat, while _k_ &lt; _len_,
+              1. If _k_ &gt; 0, then
+                1. Set _R_ to the string-concatenation of _R_ and _separator_.
+              1. Let _nextElement_ be ? Get(_array_, ! ToString(_k_)).
+              1. If _nextElement_ is not *undefined* or *null*, then
+                1. Let _S_ be ? ToString(? Invoke(_nextElement_, *"toLocaleString"*)).
+                1. Set _R_ to the string-concatenation of _R_ and _S_.
+              1. Set _k_ to _k_ + 1.
+            1. Return _R_.
+          </emu-alg>
+        </emu-clause>
       </emu-clause>
 
       <emu-clause id="sec-array.prototype.tostring">
@anba
Copy link
Contributor

anba commented Jun 29, 2020

How is that diff different from #1518 (comment)?

@devsnek
Copy link
Member

devsnek commented Jun 29, 2020

@anba i think it's pretty much the same as the diff you posted. Aside from the differences between engines that you mention in your comment above, it seems to match how engines actually behave from my testing (for example, storing the cyclic stack per-realm).

@ljharb ljharb force-pushed the master branch 2 times, most recently from 3d0c24c to 7a79833 Compare June 29, 2021 02:21
@ljharb ljharb added the editor call to be discussed in the next editor call label Dec 6, 2023
@michaelficarra
Copy link
Member

Instead of adding a slot for a seen list on the intrinsic, it should be on the agent. It cannot be on the intrinsic because that would not catch cycles across realms (but they can obviously exist since objects can be shared across realms).

@michaelficarra michaelficarra removed the editor call to be discussed in the next editor call label Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
normative change Affects behavior required to correctly evaluate some ECMAScript source text web reality
8 participants