Unified input validation
Decision
We use typebox for all input validation.
Problems
We currently use several different validation libraries depending on the service type (monolithic servers, frontend apps, service-oriented CDK apps). This prevents us from reusing the same business logic in multiple places (e.g. when validating an event that is sent between services). It also creates friction when adding validation to a new feature, because devs need to adapt to a different approach each time. Sometimes validation is performed manually, which takes more effort and is harder to do completely.
The goal of this task is to look through different validation methods and find the default one for us to use. The solution should be easy enough to use that adding validation early-on is a no-brainer and it should work regardless of whether we validate an endpoint input, customer/form input, a config, or anything else.
Context
We currently have many different ways of validating input. We use yup mostly in admin-dashboard and client-dashboard-2, ajv mostly in hermione and filch, zod in supplier-manager and suretype in our CDK-based services. In many places we handle the validation logic manually (most of the original hermione endpoints and filch jobs), and in some places barely at all (e.g. put-automations-id endpoint).
To choose the validation library, we followed this list of requirements:
- (checklist) the lib passes the Choosing dependency ADR checklist (basically, it is a non-trivial, popular and well-maintained open source library)
- (ts-types) infers TS types (or is based directly on TS types)
- (json-schema) supports JSON Schema: either it is based directly on it, or its schemas can be converted into it (so that we can use it in ApiGateway or other services that only support JSON Schema)
- (form-validation) can be integrated with form validation, preferably through an existing adapter (e.g. for react-hook-form in cd2)
- (inference-perf) high dev-time performance (doesn't slow down the IDE when inferring TS types)
- (no-framework) uses only plain JS, no additional framework-specific libraries (because validation logic is business logic and should therefore be agnostic of the tech stack it is used in)
- (approachable) is easy enough to write that adding validation early on is a no-brainer (this is somewhat subjective; in practice we should avoid more esoteric approaches)
- (no-generation) no code generation step required, so we can use the same IDE tooling as for the other code we write (though for high-volume validations it may actually be worth looking into)
- (runtime-perf) (optional) high runtime performance (we may need to verify each incoming recipient, which might be millions at a time; if necessary, we may go for a different solution in this use case, but it would still be nice if one lib could handle both situations)
- (usable-in-lambda) (optional) can be used in Lambda, meaning its size is not too big (though for CDK's RestApi we may actually use RequestValidator and Model directly)
Options
Checked on 2024-06-28
| Name | Weekly Downloads | Unpacked Size | Last publish | Ops / sec ^ |
| ------------------- | ---------------- | ------------- | ------------ | ----------- |
| already in use | | | | |
| suretype | 34 957 | 178 kB | ! a year ago | 45888428 |
| yup | 6 010 821 | 256 kB | 4 months ago | 49089 |
| ajv | 104 628 910 | 1.03 MB | 24 days ago | 36087310 |
| zod | 10 113 395 | 667 kB | a month ago | 732610 |
| also considered | | | | |
| @sinclair/typebox | 30 775 766 | 1.37 MB | 6 days ago | 72960917 |
| valibot | 205 846 | 1.41 MB | 3 days ago | 2400861 |
| superstruct | 1 374 153 | 179 kB | 8 days ago | 213117 |
| rejected | | | | |
| joi | 9 993 000 | 531 kB | 9 days ago | - |
| vest | 17 584 | 1.46 MB | 10 days ago | - |
| typia | 41 940 | 3.84 MB | a day ago | 63246549 |
| io-ts | 1 195 982 | 460 kB | 7 months ago | 2760554 |
| arktype | 33 463 | 146 kB | 3 days ago | 69527419 |
^ runtime benchmark results for assertLoose from typescript-runtime-type-benchmarks (higher is better)
Reasoning
Starting with checklist - all libraries:
- are non-trivial
- have >10k monthly downloads on npm
- are open source: all licensed under MIT (except joi: BSD-3-Clause)
- were updated during the last 6 months (except suretype)
Already in use:
suretype
- we already use it in multiple applications (CDK-based)
- checklist: no (last updated over a year ago, GitHub issues also seem kinda abandoned - though arguably it may be treated as stable library case)
- ts-types: yes
- json-schema: yes (conversion to JSON Schema through typeconv package)
- form-validation: no (no plans for react-hook-form resolver (https://dev.to/samchon/typescript-json-is-10-1000x-times-faster-than-zod-and-io-ts-8n6#comment-228pg))
- inference-perf: yes (no problem reports found)
- no-framework: yes
- approachable: yes
- no-generation: yes
- runtime-perf: (optional) yes
- usable-in-lambda: (optional) yes
Verdict: rejected due to poor support
zod
- we already use it in supplier-manager
- checklist: yes
- ts-types: yes
- json-schema: no
- form-validation: yes (has react-hook-form resolvers)
- inference-perf: reportedly has severe type inference performance issues
  - Benchmark: "seems to have some structural issues with its types, ballooning in cost by 10+ times after just 2 .extend and 1 .omit call"
- no-framework: yes
- approachable: yes
- no-generation: yes
- runtime-perf: (optional) no (also no compilation mode)
- usable-in-lambda: (optional) yes
Verdict: rejected due to devtime and runtime performance
ajv
- we already use it in hermione
- checklist: yes
- ts-types: yes (through json-schema-to-ts)
- json-schema: yes (directly uses JSON Schema)
- form-validation: yes (has react-hook-form resolvers)
- inference-perf: no (no problems with json-schema-to-ts IDE performance reported)
- no-framework: yes
- approachable: yes
- no-generation: yes (it uses startup compiler, which may take a while but then works really fast; and also supports buildtime compilation)
- runtime-perf: (optional) yes
- usable-in-lambda: (optional) yes (is big and requires companion package ajv-formats, but the JSON schema may be directly used by RequestValidator)
Verdict: contender, lost in product team voting
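For comparison, here is a minimal sketch of what "ts-types through json-schema-to-ts" looks like with ajv. The `automationSchema` object and its fields are hypothetical and exist only for this illustration; `FromSchema` and `ajv.compile` are used as we understand them from the libraries' documentation.

```typescript
import Ajv from "ajv";
import type { FromSchema } from "json-schema-to-ts";

// Hypothetical schema, written as plain JSON Schema.
// `as const` is needed so that json-schema-to-ts can infer literal types.
const automationSchema = {
  type: "object",
  properties: {
    id: { type: "string", minLength: 1 },
    enabled: { type: "boolean" },
  },
  required: ["id"],
  additionalProperties: false,
} as const;

// The TS type is derived from the schema, so the two cannot drift apart.
type Automation = FromSchema<typeof automationSchema>;

// ajv compiles the schema once at startup into a validation function.
const ajv = new Ajv();
const validateAutomation = ajv.compile<Automation>(automationSchema);

const accepted = validateAutomation({ id: "a-1", enabled: true });
const rejected = validateAutomation({ enabled: "yes" }); // missing id, wrong type
```

The schema here is written by hand as a JSON Schema literal, which is the main ergonomic difference from the builder-style libraries in this comparison.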
Also considered:
typebox
- checklist: yes
- ts-types: yes
- json-schema: yes (maps directly to JSON Schema)
- form-validation: yes (has react-hook-form resolvers)
- inference-perf: yes
- no-framework: yes
- approachable: yes
- no-generation: yes (compilation not required, but available)
- runtime-perf: (optional) yes, can work in compiled mode
- usable-in-lambda: (optional) yes (size kinda large, but potentially can generate JSON Schema for the RequestValidator during cdk deploy)
Verdict: contender, won in product team voting
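As a rough illustration of the criteria above, the sketch below defines one typebox schema and uses it for type inference, JSON Schema output, dynamic checking and compiled checking. The `RecipientAdded` payload and its fields are made up for this example; the calls (`Type`, `Static`, `Value.Check`, `TypeCompiler.Compile`) follow the typebox documentation as we understand it, so treat this as a sketch rather than a reference implementation.

```typescript
import { Type, type Static } from "@sinclair/typebox";
import { Value } from "@sinclair/typebox/value";
import { TypeCompiler } from "@sinclair/typebox/compiler";

// Hypothetical event payload, used only for illustration.
const RecipientAdded = Type.Object({
  recipientId: Type.String({ minLength: 1 }),
  channel: Type.Union([Type.Literal("email"), Type.Literal("sms")]),
  retries: Type.Optional(Type.Integer({ minimum: 0 })),
});

// ts-types: the static type is inferred from the schema.
type RecipientAdded = Static<typeof RecipientAdded>;

// json-schema: the schema object serializes to plain JSON Schema,
// which could then be handed to tools that only understand JSON Schema.
const jsonSchema = JSON.parse(JSON.stringify(RecipientAdded));

const valid: unknown = { recipientId: "r-1", channel: "email" };
const invalid: unknown = { recipientId: "", channel: "fax" };

// no-generation: dynamic checking works without any build step.
const dynamicOk = Value.Check(RecipientAdded, valid);
const dynamicBad = Value.Check(RecipientAdded, invalid);

// runtime-perf: optional compiled mode for hot paths.
const compiled = TypeCompiler.Compile(RecipientAdded);
const compiledOk = compiled.Check(valid);
const errorPaths = [...compiled.Errors(invalid)].map((e) => e.path);
```

The same `RecipientAdded` constant serves as the runtime schema and, through `Static`, as the compile-time type, which is what lets one definition be shared between services.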
valibot
- checklist: yes
- ts-types: yes
- json-schema: no (package valibot-json-schema exists, but: "Some of the features of Valibot can't be converted to JSON schema")
- form-validation: yes (has react-hook-form resolvers)
- inference-perf: no (no reports found)
- no-framework: yes
- approachable: yes
- no-generation: yes
- runtime-perf: (optional) no (no compilation mode)
- usable-in-lambda: (optional) yes (modular design to support tree-shaking)
Verdict: rejected due to no JSON Schema interoperability
superstruct
- checklist: yes
- ts-types: yes
- json-schema: no
- form-validation: yes (has react-hook-form resolvers)
- inference-perf: yes
- no-framework: yes
- approachable: yes
- no-generation: yes
- runtime-perf: (optional) no (no compilation mode)
- usable-in-lambda: (optional) yes
Verdict: rejected due to no JSON Schema interoperability
Outright rejected:
yup
- ts-types: no
joi
- ts-types: no
vest
- ts-types: no (manual typing)
typia
- form-validation: no (has no react-hook-form resolvers)
- no-generation: no (requires Ahead of Time compilation)
  - "When you write a TypeScript code calling typia.createIs() function and compile it through tsc command, typia will replace the typia.createIs() statement to optimal validation code in the compiled JavaScript file, for the IMember type."
io-ts
- approachable: no (relies on monads from fp-ts, may be a bit too weird?)
  - "The Either type returned by decode is defined in fp-ts, a library containing implementations of common algebraic types in TypeScript."
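    import { isLeft } from "fp-ts/Either";
    import { PathReporter } from "io-ts/PathReporter";

    // User is an io-ts codec and data is the unknown input, both defined elsewhere.
    const decoded = User.decode(data); // Either<Errors, User>
    if (isLeft(decoded)) {
      throw Error(
        `Could not validate data: ${PathReporter.report(decoded).join("\n")}`
      );
    }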
arktype
- json-schema: no (uses its own object schema format, not compatible with JSON Schema, not supported by IDEs)
    type({
      keyword: "null",
      stringLiteral: "'TS'",
      numberLiteral: "5",
      bigintLiteral: "5n",
      union: "string|number",
      intersection: "boolean&true",
      array: "Date[]",
      grouping: "(0|1)[]",
      objectLiteral: {
        nested: "string",
        "optional?": "number"
      },
      tuple: ["number", "number"]
    })
Consequences
How do we implement this change?
Whenever a team writes a new endpoint, configuration, or other code that benefits from validation, they use typebox. If the code requires frontend validation, the team will also use the opportunity to create an ADR about forms and form validation to decide on the other libraries needed. Another ADR will be needed for the solution to generate client libraries (see "What is still unclear?" below). Also, if time permits, every team will migrate existing code to the unified validation in the course of normal refactoring. We will probably not refactor code purely to update the validation method.
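To make "they use typebox" a bit more concrete for a new endpoint, a shared helper could look roughly like the sketch below. `parseOrThrow` and `PutAutomationBody` are hypothetical names invented for this example, not existing code; `Value.Check` and `Value.Errors` are used as we understand them from the typebox documentation.

```typescript
import { Type, type Static, type TSchema } from "@sinclair/typebox";
import { Value } from "@sinclair/typebox/value";

// Hypothetical helper: returns the input typed as Static<T>,
// or throws an Error listing every failing path.
function parseOrThrow<T extends TSchema>(schema: T, input: unknown): Static<T> {
  if (Value.Check(schema, input)) return input;
  const details = [...Value.Errors(schema, input)]
    .map((e) => `${e.path || "/"}: ${e.message}`)
    .join("; ");
  throw new Error(`Validation failed: ${details}`);
}

// Hypothetical request body for an endpoint.
const PutAutomationBody = Type.Object({
  name: Type.String({ minLength: 1 }),
  enabled: Type.Boolean(),
});

// In a handler, the raw body would come from the request.
const body = parseOrThrow(
  PutAutomationBody,
  JSON.parse('{"name":"welcome","enabled":true}')
);
// body is now typed as { name: string; enabled: boolean }
```

The same helper would work unchanged for configs or events, since it only depends on the schema passed in.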
Who will implement the change?
All the developers.
How do we teach this change?
A teaching branch was presented that shows typebox replacing the existing validation in several typical use cases (endpoint, config, CDK RestApi).
Adrian will soon run a Learning Friday workshop to introduce typebox.
What could go wrong?
typebox may lack some functionality that we need and that was not discovered during the ADR investigation phase.
What do we do if something goes wrong?
We do the missing validation by hand, or switch to ajv, which should be mostly compatible because both work with JSON Schema.
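The fallback relies on typebox schemas being plain JSON Schema objects, so ajv should be able to compile them directly. A minimal sketch, assuming a hypothetical `Config` schema and only standard JSON Schema keywords (typebox types that go beyond JSON Schema would not carry over this way):

```typescript
import Ajv from "ajv";
import { Type, type Static } from "@sinclair/typebox";

// Hypothetical config schema, defined with typebox.
const Config = Type.Object({
  region: Type.String({ minLength: 1 }),
  batchSize: Type.Integer({ minimum: 1, maximum: 1000 }),
});
type Config = Static<typeof Config>;

// The existing typebox schema is passed to ajv as-is.
const ajv = new Ajv();
const validateConfig = ajv.compile<Config>(Config);

const goodConfig = validateConfig({ region: "eu-west-1", batchSize: 100 });
const badConfig = validateConfig({ region: "", batchSize: 0 });
```

In this scenario the schema definitions and inferred types stay in place, and only the checking call changes.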
What is still unclear?
What about generating the client library from REST endpoints spec (like the OpenAPI schema -> TS SDK thing that we have in cd2)?
RTK Query ADR says we use @rtk-query/codegen-openapi which requires OpenAPI spec. Specifically for AWS CDK's RestApi it's possible to use the OpenAPI spec as the routing/validation/handler config. You could even try to build it on the fly.