Reduce code size by optionally refactoring errors into a function #1098

fabiospampinato · 2019-10-06T23:36:33Z

I'm using ajv in a library for managing settings, and while optimizing it I think I discovered a few potentially good ideas for optimizing ajv's initialization time.

ajv is plenty fast when validating, but in order to get there one has to first instantiate it and compile the schema, which are not particularly fast operations.

Lazy loading dependencies

It takes about 30ms to require ajv on my machine. That's not too bad, but I think it could be brought down further by lazy loading dependencies only when they are actually about to be used.
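A minimal sketch of the lazy-loading idea, assuming CommonJS and using Node's built-in url module as a stand-in for a heavier dependency. The lazyRequire and parseHost names are hypothetical and are not part of ajv:

```javascript
// Hypothetical helper: defers require() until the returned getter is first called.
function lazyRequire(name) {
  let cached;
  return () => {
    if (cached === undefined) cached = require(name);
    return cached;
  };
}

// Nothing is loaded at this point; "url" stands in for a heavier dependency.
const getUrl = lazyRequire("url");

function parseHost(href) {
  // The module is loaded on the first call and reused afterwards.
  const { URL } = getUrl();
  return new URL(href).host;
}
```

Code paths that never call parseHost never pay for loading the module, which is the saving the suggestion is after.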

Disabling schema caching and serialization

There's no standalone option for disabling schema caching and serialization. Currently one has to set the following options, which are a bit tedious to write (and perhaps hard to come up with):

{
  serialize: false,
  cache: {
    put: () => {},
    get: () => {},
    del: () => {},
    clear: () => {}
  }
}

I think there are some very common use cases where schema caching is just not useful, for example if I only have a single schema, or if I never re-compile the same schemas, then this is just:

- wasting time serializing objects.
- wasting time importing fast-json-stable-stringify without using it (under some scenarios at least).
  - I'm also not sure fast-json-stable-stringify is needed: JSON.stringify is faster, and although the order of object properties is not guaranteed in JS, in practice V8 keeps them ordered. I think the other major engines probably do the same, and they aren't going to change this, for backwards compatibility.
- wasting memory storing the string representation of our schemas, plus potentially older compiled schemas no longer in use.
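To make the serialization trade-off concrete, here is a small sketch. JSON.stringify follows key insertion order, so two schemas with the same content but different key order produce different strings, while a key-sorting serializer produces the same one. The stableStringify function below is a simplified stand-in written for this example (JSON-compatible values only), not the actual fast-json-stable-stringify implementation:

```javascript
// Two schemas with identical content but different key insertion order.
const a = { type: "object", required: ["x"] };
const b = { required: ["x"], type: "object" };

// Plain JSON.stringify follows insertion order, so the strings differ.
const plainEqual = JSON.stringify(a) === JSON.stringify(b);

// Simplified key-sorting serializer (assumes JSON-compatible input).
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value);
}

// With sorted keys both schemas map to the same cache key.
const stableEqual = stableStringify(a) === stableStringify(b);
```

Whether that extra guarantee is worth the cost depends on whether equivalent schemas are ever built with different key orders; with a single schema it does not matter.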

I think a standalone option for disabling schema caching entirely, and a mention in the readme that a slight performance and memory gain can be achieved by disabling it, would be a good idea.

Shorter validation functions

This is perhaps the biggest optimization opportunity I found.

The generated validation function is too long and grows too fast. I've attached sample source code generated by compiling a very simple schema with only about 100 properties in it; it weighs 40kb!

As soon as the schema becomes more complicated, and especially if more properties are added, the generated source code may very well weigh 1mb or more. All this source code wastes memory and it needs to be parsed and that's not free, especially on mobile.

Most of the source code is actually about generating error objects and updating the related counters for each validation block. I think this issue should be addressed like so:

- When it.level === 0, a helper function for generating the error objects of that specific validation function should be emitted once, and then called wherever the rest of the generated code creates an error object.
  - Most validations pass, and validation usually stops as soon as the first error is found, so the extra function call won't have any meaningful performance cost; the engine might even inline it.
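The difference can be sketched with two hand-written validators for an object with a string name and a numeric age. These are illustrative approximations only, not actual Ajv output; the addError helper name and the trimmed error shape are assumptions made for this example:

```javascript
// Inline style: every check repeats the error-object boilerplate.
function validateInline(data) {
  let vErrors = null;
  let errors = 0;
  if (typeof data.name !== "string") {
    const err = { keyword: "type", dataPath: ".name", params: { type: "string" }, message: "should be string" };
    if (vErrors === null) vErrors = [err];
    else vErrors.push(err);
    errors++;
  }
  if (typeof data.age !== "number") {
    const err = { keyword: "type", dataPath: ".age", params: { type: "number" }, message: "should be number" };
    if (vErrors === null) vErrors = [err];
    else vErrors.push(err);
    errors++;
  }
  validateInline.errors = vErrors;
  return errors === 0;
}

// Helper style: the boilerplate is emitted once and each check is a short call.
function validateWithHelper(data) {
  let vErrors = null;
  let errors = 0;
  // Hypothetical helper, emitted once per validation function.
  function addError(keyword, dataPath, params, message) {
    const err = { keyword, dataPath, params, message };
    if (vErrors === null) vErrors = [err];
    else vErrors.push(err);
    errors++;
  }
  if (typeof data.name !== "string") addError("type", ".name", { type: "string" }, "should be string");
  if (typeof data.age !== "number") addError("type", ".age", { type: "number" }, "should be number");
  validateWithHelper.errors = vErrors;
  return errors === 0;
}
```

Both functions report the same errors. In the helper version each additional check adds one short line instead of a five-line block, which is where the size reduction would come from as the number of properties grows.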

I think the generated source code under my example scenario could weigh about 15kb rather than 40kb just by implementing this, a reduction of about 60%.

source.js

I hope these suggestions are useful. They might not sound like much under some scenarios, but I think they can make a difference under others.

epoberezkin · 2019-11-24T11:32:07Z

@fabiospampinato thanks for the suggestions

Disabling schema caching and serialization

The simplest change without adding options would be to add cache: false (and not call any cache methods in this case), and add the suggested note to readme.

Shorter validation functions

I thought about it and tried extracting error generation into a helper function a long time ago, but it is not as simple as it may seem: it requires substantial refactoring. See this branch: https://github.com/epoberezkin/ajv/commits/error-function

fabiospampinato · 2019-11-26T16:39:01Z

> The simplest change without adding options would be to add cache: false (and not call any cache methods in this case), and add the suggested note to readme.

Sounds like the best solution 👍.

> I thought about it and tried extracting error generation into a helper function a long time ago, but it is not as simple as it may seem: it requires substantial refactoring. See this branch error-function (commits)

Does it "just" require a lot of manual refactoring or are there any technically challenging problems of this approach? Because I'm using it here, and other than for writing/reading kind-of minified JS (which could still be generated via an external tool to keep the code readable) it was relatively simple to do.

epoberezkin · 2019-11-26T17:57:31Z

> Does it "just" require a lot of manual refactoring or are there any technically challenging problems of this approach? Because I'm using it here, and other than for writing/reading kind-of minified JS (which could still be generated via an external tool to keep the code readable) it was relatively simple to do.

As I wrote, it was a long time ago :) I honestly don't remember what blocked me when I tried. It was likely some challenge at the template level, where there are lots of template-level conditionals in string constants...

epoberezkin · 2020-09-15T09:05:17Z

It should be relatively easy to address by optionally moving error generation into a separate function stored in the shared scope (available from v7-alpha). Given that it is a trade-off (smaller function and faster parsing, but slower execution on failing cases), it should be behind an option (name TBC).

Documentation on code generation would help (TODO).

epoberezkin · 2020-09-15T09:06:49Z

The error-function branch has been removed, as it is no longer relevant.

epoberezkin · 2021-02-10T12:30:32Z

Schemas are no longer serialised, but it did cause some confusion - #1413

Renamed issue to show what is left.

epoberezkin · 2021-03-14T09:47:12Z

The solution is to use messages: false; if messages are needed, they can be generated with ajv-i18n (they are effectively refactored there). The other parameters in error objects are needed anyway (they would be passed as error function parameters if it were refactored), so there is no real benefit.
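The idea of leaving messages out of the generated code and producing them afterwards from keyword and params can be sketched as follows. The templates table, the addMessages function, the message wording, and the trimmed error objects are hypothetical; this is not ajv-i18n's actual implementation:

```javascript
// Hypothetical message table keyed by validation keyword.
const templates = {
  type: (params) => `should be ${params.type}`,
  required: (params) => `should have required property '${params.missingProperty}'`,
};

// Fills in a message for each error that lacks one and has a known keyword.
function addMessages(errors) {
  for (const err of errors) {
    const template = templates[err.keyword];
    if (template !== undefined && err.message === undefined) {
      err.message = template(err.params);
    }
  }
  return errors;
}

// Trimmed error objects of the kind produced without messages.
const sampleErrors = [
  { keyword: "type", params: { type: "string" } },
  { keyword: "required", params: { missingProperty: "name" } },
  { keyword: "customKeyword", params: {} },
];
addMessages(sampleErrors);
```

Because the message strings live in one shared table rather than in every generated validation function, the per-schema code only has to carry the keyword and its parameters.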

fabiospampinato added the enhancement label Oct 6, 2019

epoberezkin added this to the 7.0.0 milestone Sep 15, 2020

willfarrell mentioned this issue Jan 2, 2021

Version 2 middyjs/middy#585

Closed

63 tasks

epoberezkin modified the milestones: 7.0.0, 8.0.0 Feb 10, 2021

epoberezkin changed the title ~~Optimize initialization time~~ Feb 10, 2021

epoberezkin mentioned this issue Feb 11, 2021

Change from v6 to v7: schema object itself used as a key for compiled schema function, not serialization #1413

Open

epoberezkin closed this as completed Mar 14, 2021

epoberezkin mentioned this issue Aug 1, 2021

Breaking: Ajv to v8.5.0, added ajv-draft-04 (fixes #13888) eslint/eslint#13911

Closed

1 task
