Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[--env-file] Space between = and " will cause the value to be parsed as an unquoted string #53461

Closed
panzi opened this issue Jun 15, 2024 · 15 comments
Labels
confirmed-bug Issues with confirmed bugs. dotenv Issues and PRs related to .env file parsing

Comments

@panzi
Copy link

panzi commented Jun 15, 2024

Meaning if you have this in your .env file:

FOO= "BAR"

This command:

node --env-file=.env -e 'console.log({ FOO: process.env.FOO })'

Will print this:

{ FOO: '"BAR"' }

I.e. the variable value includes the quotes. (The spaces aren't included in the value, though.) I think this is unexpected.

If this is not intentional there needs to be white space skipping before this line:

if (content.front() == '"') {

@RedYetiDev
Copy link
Member

Thanks for the bug report! Feel free to submit a PR with your requested changes, and someone will review it.

@RedYetiDev RedYetiDev added the dotenv Issues and PRs related to .env file parsing label Jun 15, 2024
@panzi
Copy link
Author

panzi commented Jun 15, 2024

Sorry, I'm not interested in contributing C++ code to NodeJS at the moment. I just noticed a lot of quirks with... well basically almost all dotenv parsers out there and am currently comparing them for fun. There are many more quirks in NodeJS's dotenv parser, but the others are for a bit more unusual situations. Having a space before the quote isn't that unusual though, I think. The other quirks that I'm just now writing down (copied form that document) are (in case you're interested):

This dialect supports strings quoited in double quotes ("), single quotes (')
and back ticks (`). These strings can be multi-line, but only in double quoted
strings \n will be translated to newlines.

If the second quote is missing only the current line is used as the value for
the variable. Parsing of more variables continues in the next line!

Quotes start with #. There doesn't need to be a space before the #.

Keys may contain anything except spaces ( ), including tabs and newlines.
Meaning this:

FOO#=1
BAR
=2

Is equivalent with this JSON:

{ "FOO#": "1", "BAR\n": "2" }

Lines with syntax errors (i.e. no =) are silently ignored, but they will trip
up the parser so that the following correct line is also ignored.

Leading export will be ignored. Yes, the export needs to be followed by a
space. If its a tab its used as part of the key.

@RedYetiDev
Copy link
Member

No problem! The parser specification hasn't been done yet, but it's being discussed at #49148.

@IlyasShabi i know you worked on writing a specification, any insight? (Thanks!)

@tniessen
Copy link
Member

@panzi There is no proper spec for this feature, the only reference is "whatever dotenv does." Do you happen to know what the expected behavior is, based on that?

@RedYetiDev
Copy link
Member

RedYetiDev commented Jun 15, 2024

Dotenv skips the white space

var dotenv = require("dotenv")
console.log(dotenv.parse("a= 'b'"))
// {a: "b"}

For the other case:

var dotenv = require("dotenv")
console.log(dotenv.parse(`
FOO#=1
BAR
=2
`))
// {BAR: "2"}
@RedYetiDev
Copy link
Member

IIUC, this is your list of quirks:

  1. This dialect supports strings quoted in double quotes ("), single quotes ('), and backticks (``).
    • These strings can be multi-line, but only in double-quoted strings \n will be translated to newlines.
  2. If the second quote is missing, only the current line is used as the value for the variable.
    • Parsing of more variables continues in the next line.
  3. Quotes start with #.
    • There doesn't need to be a space before the #.
  4. Keys may contain anything except spaces, including tabs and newlines.
  5. Lines with syntax errors (e.g., no =) are silently ignored.
    • They will cause the parser to ignore the following correct line as well.
  6. Leading export will be ignored.
    • The export needs to be followed by a space. If it is a tab, it is used as part of the key.
@anonrig anonrig added the confirmed-bug Issues with confirmed bugs. label Jun 15, 2024
@anonrig anonrig self-assigned this Jun 15, 2024
@anonrig
Copy link
Member

anonrig commented Jun 15, 2024

Nice bug report. I'll fix it.

@panzi
Copy link
Author

panzi commented Jun 16, 2024

@panzi There is no proper spec for this feature, the only reference is "whatever dotenv does." Do you happen to know what the expected behavior is, based on that?

While this implementation is not behaving quite like the npm dotenv package I'm not sure if emulating that behavior 100% is even a good idea. That package parses this:

FOO
=BAR

As:

{ "FOO": "BAR" }

Is that a good idea?

Just for fun I'm looking at different dotenv implementations, write down how they work and try to implement a compatible parser in Rust (except for deliberate differences where e.g. I won't import a Unicode database to resolve named Unicode escape sequences (Python dotenv-cli), don't emulate obvious mangled encoding handling (Python dotenv-cli), and don't implement what I consider arbitrary code execution vulnerabilities (Ruby dotenv)). Here is what I have written down so far, in case you are interested in maybe comparing the sections about NodeJS and JavaScript dotenv: https://github.com/panzi/punktum#nodejs-dialect
I'm sure I have lots of English spelling mistakes and the document could be structured better. It's not finished in any case.

Looking at all the dotenv implementations I don't know what I'd expect anymore, except that white space around keys and values isn't significant.

@tniessen
Copy link
Member

While this implementation is not behaving quite like the npm dotenv package I'm not sure if emulating that behavior 100% is even a good idea.

I am not claiming that it's a good idea, but as far as I can tell, it is what's been happening thus far. And deviating from that convention ad-hoc, without a proper specification, IMO doesn't seem like a good idea either.

@panzi
Copy link
Author

panzi commented Jun 17, 2024

Just for fun I wrote my own implementation of the JavaScript dotenv parser in C++, trying to be 100% compatible. I haven't found an edge case where it produces the any different output to the original. I've licensed it under MIT, if you want you can just copy-paste it into NodeJS. If you prefer any other open source license I can do that, too. See: https://github.com/panzi/cpp-dotenv

I'm not a C++ pro, so things might not be 100% ideal. Also I write the environment into an std::unordered_map<std::string, std::string> and use some C++20 features (std::string_view::find_first_not_of() and std::string_view::find_last_not_of()). See also all the comments that say "SYNTAX ERROR". While the original just ignores these, and I also ignore them the same way the original does, you might want to log errors there. The original supports DOTENV_CONFIG_DEBUG=true, but it only logs things like that it couldn't read the .env file, that there where encoding errors (this C++ version ignores encodings, it will work with ASCII, ISO-8859-1, and UTF-8, since it only compares with ASCII characters), or that it did/didn't overwrite an pre-existing environment variable (depending on DOTENV_CONFIG_OVERRIDE=ture - yes, override, not overwrite). Instead maybe you could make it so it logs these syntax errors when that environment variable is ste to true? In my (also just for fun) Rust implementation I did just that.

DOS line ending support needs testing. And might be possible to somehow implement without always allocating a new string.

@panzi
Copy link
Author

panzi commented Jun 26, 2024

I found yet another edge case where my (and the NodeJS) implementation behaved differently than the JavaScript implementation and fixed it in my code. Would you like to use that code in NodeJS? I didn't submit it as a pull request because I assume building a huge project like NodeJS might be a hassle and that's just some fun thing I did in my spare time. Maybe I'll do that anyway at some point, but only if there is interest in this.

@thisalihassan
Copy link
Contributor

I would like to work on it if it is still up for grabs

@RedYetiDev
Copy link
Member

I would like to work on it if it is still up for grabs

IIRC @anonrig said he'd work on it, so you'd have to ask him.

@panzi
Copy link
Author

panzi commented Jul 7, 2024

If you don't just use my code you can still look at the source comments about correctly emulating the behavior of the regular expression of the original (without using regular expressions in C++).

MOHIT51196 added a commit to MOHIT51196/node that referenced this issue Jul 9, 2024
Fix to ignore spaces between '=' and quoted string in env file to avoid quotes in env values

Fixes: nodejs#53461

Signed-off-by: Mohit Malhotra <dev.mohitmalhotra@gmail.com>
@RedYetiDev
Copy link
Member

Fixed by #53786

@anonrig anonrig removed their assignment Jul 9, 2024
MOHIT51196 added a commit to MOHIT51196/node that referenced this issue Jul 9, 2024
Fix to ignore spaces between '=' and quoted string in env file

Fixes: nodejs#53461

Signed-off-by: Mohit Malhotra <dev.mohitmalhotra@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. dotenv Issues and PRs related to .env file parsing
5 participants