Node.js Course For GitHub
Yoni Goldberg
About Me – Yoni Goldberg
Backend. Only
Node. Only
Top 15 Node.JS projects on GitHub
Some Formalities Before We Take Off
Course flow
Prerequisites
Environment setup
Supporting materials
Questions (www.clarify.live)
Post course meeting
Intro
What is Node.JS?
It All Began 8 Years Ago…
2018 - 8 Years Later
8M servers worldwide · 1st most popular framework (StackOverflow) · 2nd most loved technology (StackOverflow)
Platforms Packages Compared
Maven (Java) – ~220,000
NuGet (.NET) – ~100,000
Platforms Packages Compared
NPM (Node.js) – ~596,000
What Is Node Secret Sauce?
1. Node Secret Sauce - House Of Many Ideas
OOP vs Functional
726,000 packages
+ Service Architecture
+ Web Framework
+ Testing
+ Production Setup
+ Node Engine
Section
Development Basics
The Basic Pillars - JS
The Basic Pillars - Modules
Creating Modules
Module = file
Asynchronous Programming
Async Programming - Context
1% Executing
Synchronous Programming – Simple And Wasteful
We didn’t wait. But who will handle the result once it arrives?
100% Executing
Asynchronous Programming – Efficient And Complex
Node uses one thread which can handle ~10,000 requests per second
Nesting
Try-Catch
Debugging
Stacktrace
Synchronous loop
Asynchronous loop
Bottom Line
Option 1: Callback
Get input JSON with configuration – return in English or with local translation
Get current user id from DB
Fetch user orders
Based on orders, get all user products (translated or in English)
Return products. That’s it
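One possible callback-style sketch of this flow (all helper names are hypothetical, and I/O is simulated with setImmediate):

```javascript
// Hypothetical async helpers, stubbed to simulate I/O
function getCurrentUserId(cb) { setImmediate(() => cb(null, 1)); }
function fetchUserOrders(userId, cb) { setImmediate(() => cb(null, [{ product: 'book' }])); }
function getProductsForOrders(orders, translate, cb) {
  setImmediate(() => cb(null, orders.map((o) => o.product)));
}

// The flow above in callback style: each step nests inside the previous
// one, and every callback must forward errors manually
function getUserProducts(config, callback) {
  getCurrentUserId((err, userId) => {
    if (err) return callback(err);
    fetchUserOrders(userId, (err, orders) => {
      if (err) return callback(err);
      getProductsForOrders(orders, config.translate, (err, products) => {
        if (err) return callback(err);
        callback(null, products); // that's it
      });
    });
  });
}
```

Each additional step adds another level of nesting and another manual error check.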
Callback Exercise
Nesting: High
Try-Catch: No
Debugging: Tolerable
Synchronous loop: No
Asynchronous loop: No
Bottom Line
Option 2: Promises
Nesting: High
Try-Catch: No
Debugging: Tolerable
Synchronous loop: No
Asynchronous loop: No
Bottom Line
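The same hypothetical flow rewritten with promises: the chain stays flat and a single .catch collects errors from every step (helper names are again illustrative stubs):

```javascript
// Stubbed promise-returning helpers
const getCurrentUserId = () => Promise.resolve(1);
const fetchUserOrders = (userId) => Promise.resolve([{ product: 'book' }]);
const getProductsForOrders = (orders, translate) =>
  Promise.resolve(orders.map((o) => o.product));

function getUserProducts(config) {
  return getCurrentUserId()
    .then((userId) => fetchUserOrders(userId))
    .then((orders) => getProductsForOrders(orders, config.translate))
    .catch((error) => {
      // one place to handle errors for the whole chain
      throw error;
    });
}
```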
Option 3: Async/await
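The same hypothetical flow with async/await: it reads top-to-bottom and works with the familiar try-catch (helpers are illustrative stubs):

```javascript
// Stubbed promise-returning helpers
const getCurrentUserId = () => Promise.resolve(1);
const fetchUserOrders = (userId) => Promise.resolve([{ product: 'book' }]);
const getProductsForOrders = (orders, translate) =>
  Promise.resolve(orders.map((o) => o.product));

async function getUserProducts(config) {
  try {
    const userId = await getCurrentUserId();
    const orders = await fetchUserOrders(userId);
    return await getProductsForOrders(orders, config.translate);
  } catch (error) {
    // synchronous-style error handling for asynchronous steps
    throw error;
  }
}
```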
Async/await Exercise
+ Testing
+ Production Setup
= Great Solution
Anatomy Of Any Web App
Express Coverage
#must #quick-start
Express
Node.JS Framework
Popularity
What Express Brings To The Table
Project Structure
Inspiration
Benefits:
Small is simple
Deploy with confidence
Resiliency
Tech agnostic
Granularly scalable
Moving From Spaghetti
To Ravioli
Microservices guidelines
#must #quick-start
2004 2018
DDD Core Principle -
Isolate The Domain
#must #quick-start
This is where the business features live
DDD Layers In Node.JS
#must #quick-start
Express, etc
Custom code
NPM
Clean Architecture
#must #quick-start
“Heavy architecture” (Martin Fowler)
#must #quick-start
Practical Project Structure Guidelines
Express != microservice
#must #quick-start
Anatomy Of a Microservice
#must #quick-start
Project Structure Demo
Section
NPM
NPM – Overview
npm init
Code it!
npm adduser
npm publish
Publishing NPM Package Exercise
1.7.3
MAJOR version: when you make incompatible API changes
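A quick illustration of the three components (plain JS, with deliberately simplified parsing):

```javascript
// 1.7.3 → MAJOR.MINOR.PATCH
const [major, minor, patch] = '1.7.3'.split('.').map(Number);
console.log(major, minor, patch); // 1 7 3

// A caret range like '^1.7.3' accepts any 1.x.y at or above 1.7.3;
// a MAJOR bump (2.0.0) is excluded because it signals incompatible API changes
```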
Error Handling
The Ultimate Goals Of Error Handling
01 Visibility through operational dashboard
02 Optimized API results
03 Maximum uptime
Error Handling Typical Flow
Throwing Errors
const user = await logIn("username", "password");
if (!user) {
  throw new appError('userDoesntExist', 400, 'Couldnt find the user when trying to get products', true);
}
Catchers
Local vs Global
Promise vs Async/await
Uncaught exceptions
Error Handler
if (!err.isOperational) {
  process.exit(1);
}
Operational vs Developer Errors
Definition: Operational – run-time errors that the program can predict or reason about. Developer – errors whose impact the program cannot reason about.
Examples: Operational – invalid input, query timeout, 3rd-party server not responding. Developer – compilation error, startup error, unknown error, errors originating from 'stateful' objects.
Node Mechanics
Why Should I Care About How Node Works?
01 When will future code run? Differences between various APIs: setTimeout vs setImmediate vs callback, etc.
02 When does my code become CPU-intensive? How sensitive is Node to synchronous code?
03 Should I cluster? Use one Node process or multiple?
The first thing
About Node.JS mechanics
Single Threaded
Non-Blocking
Node.JS Building Blocks
JS Engine
02 03 04
Flags: --POOL_SIZE
4 Facts About V8
01 Fast – transforms JS into machine code
02 Single Threaded – runs using one thread only
03 Replaceable – other vendors aim to provide their own implementation
04 Sometimes not fast enough – GC, dynamic typing, etc, all come with a price
Typical Program Flow
Demonstrating Node internals with real code (slide columns: Code | V8 Stack | Async API)

console.log('starting')
validate()
saveUser()
console.log('end')

validate = () => { console.log('validate'); saveUser() };

The Main frame runs each statement on the V8 stack in turn: console.log('starting'), then validate(), then saveUser(), which hands DB.save() to the async API, and finally console.log('end').
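The split between the synchronous stack and the async API can be shown with a tiny runnable sketch (setImmediate stands in for the DB call):

```javascript
const order = [];

order.push('starting');                   // console.log('starting')
order.push('validate');                   // validate()
setImmediate(() => order.push('saved'));  // saveUser() hands DB.save() to the async API
order.push('end');                        // console.log('end')

// All synchronous frames have completed; the async callback has not run yet
console.log(order.join(' -> ')); // starting -> validate -> end
```

Only after the stack empties does the event loop invoke the queued callback.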
Part 6
Continuous Quality
Testing & Inspecting
Continuous Integration Inspection Trends
Old school vs modern
Demo
CI platforms compared (axes: Easy vs Flexible, by Popularity): CircleCI, GitLab, CodeShip, Travis, AWS CodeBuild, Jenkins
Take Away
01 Simple CI? Easy! Getting test, coverage and linting automation is now a breeze.
02 Everything Docker. CI tools are built around Docker; plan your application around Docker.
03 Plenty Of Tools. The market is not short of rich analysis tools; don't miss your free lunch.
04 Developer Experience. CI is merely a bot that should buy developers time to handle strategic issues.
Part 7
DevOps
Production High-level
Topology
thank you.
Appendix
Node.JS
Best Practices
Appendix
“...if you were looking at the architecture of a library, you’d likely see a
grand entrance, an area for check-in-out clerks, reading areas, small
conference rooms, and gallery after gallery capable of holding
bookshelves for all the books in the library. That architecture would
scream: Library.”
Good: Structure your solution by self-contained components
1.1 Structure your solution by components
Bad: Group your files by technical role
1.2 Layer your components, keep Express within its
boundaries
Otherwise: An app that mixes web objects with other layers cannot be accessed by testing code, CRON jobs, and other non-Express callers
Separate component code into layers: web, services, and DAL
1.3 Wrap common utilities as NPM packages
Once you start growing and have different components on different servers which consume similar utilities, you should start managing the dependencies: how can you keep one copy of your utility code and let multiple consumer components use and deploy it? Well, there is a tool for that; it's called npm. Start by wrapping 3rd-party utility packages with your own code to make them easily replaceable in the future, and publish your own code as a private npm package. Now all of your code base can import that code and benefit from a free dependency management tool. It's possible to publish npm packages for your own private use without sharing them publicly, using private modules, a private registry, or local npm packages.
Sharing your own common utilities across environments and components
1.4 Separate Express 'app' and 'server'
Otherwise: Your API will be accessible for testing via HTTP calls only (slower
and much harder to generate coverage reports). It probably won't be a big
pleasure to maintain hundreds of lines of code in a single file
The latest Express generator comes with a great practice worth keeping: the API declaration is separated from the network-related configuration (port, protocol, etc). This allows testing the API in-process, without performing network calls, with all the benefits that brings: fast test execution and coverage metrics for the code. It also allows deploying the same API under flexible and different network conditions. Bonus: better separation of concerns and cleaner code.
Code example: the API declaration should reside in app.js, while the network setup (getting the port from the environment, creating the HTTP server) lives in a separate server file.
// in-process API test (assuming the 'supertest' package and an exported app)
const request = require('supertest');
const app = require('../app');

request(app)
  .get('/user')
  .expect('Content-Type', /json/)
  .expect('Content-Length', '15')
  .expect(200)
  .end(function (err, res) {
    if (err) throw err;
  });
1.5 Use environment aware, secure and hierarchical config
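A minimal pure-JS sketch of the 'hierarchical' idea (key names and file contents are illustrative; packages like nconf or convict do this properly):

```javascript
// Defaults that would normally come from a checked-in config file
const fileConfig = { dbHost: 'localhost', port: 3000 };

// Lookup order: command-line argument > environment variable > file default
function getConfig(key) {
  const argPrefix = `--${key}=`;
  const fromArgv = process.argv.find((a) => a.startsWith(argPrefix));
  if (fromArgv) return fromArgv.slice(argPrefix.length);
  const fromEnv = process.env[key.toUpperCase()];
  if (fromEnv !== undefined) return fromEnv;
  return fileConfig[key];
}

console.log(getConfig('dbHost'));
```

Secrets (passwords, API keys) should come only from the environment or a vault, never from the checked-in file layer.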
doWork()
.then(doWork)
.then(doOtherWork)
.then((result) => doWork)
.catch((error) => {throw error;})
.then(verify);
Anti pattern code example – callback style error handling
“……And in fact, callbacks do something even more sinister: they deprive us of the stack,
which is something we usually take for granted in programming languages. Writing code
without a stack is a lot like driving a car without a brake pedal: you don’t realize how badly
you need it until you reach for it and it’s not there. The whole point of promises is to give
us back the language fundamentals we lost when we went async: return, throw, and the
stack. But you have to know how to use promises correctly in order to take advantage of
them.”
// throwing a string lacks any stack trace information and other important data properties
if(!productToAdd)
throw ("How can I add new product when no value provided?");
Code example – doing it even better
appError.prototype.__proto__ = Error.prototype;
module.exports.appError = appError;
“…Personally, I don’t see the value in having lots of different types of error
objects – JavaScript, as a language, doesn’t seem to cater to Constructor-
based error-catching. As such, differentiating on an object property seems far
easier than differentiating on a Constructor type…”
“…One problem that I have with the Error class is that it is not so simple to extend. Of course,
you can inherit the class and create your own Error classes like HttpError, DbError, etc.
However, that takes time and doesn’t add too much value unless you are doing something
with types. Sometimes, you just want to add a message and keep the inner error, and
sometimes you might want to extend the error with parameters, and such…”
“…All JavaScript and System errors raised by Node.js inherit from, or are instances of, the standard JavaScript
Error class and are guaranteed to provide at least the properties available on that class. A generic JavaScript
Error object that does not denote any specific circumstance of why the error occurred. Error objects capture a
“stack trace” detailing the point in the code at which the Error was instantiated, and may provide a text
description of the error. All errors generated by Node.js, including all System and JavaScript errors, will either be
instances of or inherit from, the Error class…”
From
Node.js official documentation
2.3 Distinguish operational vs programmer errors
Otherwise: You may always restart the application when an error appears, but why let ~5000 online users down because of a minor, predicted, operational error? The opposite is also not ideal: keeping the application up when an unknown issue (programmer error) occurred might lead to unpredictable behavior. Differentiating the two allows acting tactfully and applying a balanced approach based on the given context
2.3 Distinguish operational vs programmer errors
// or if you're using some centralized error factory (see other examples at the bullet "Use only the built-in Error object")
function appError(commonType, description, isOperational) {
  Error.call(this);
  Error.captureStackTrace(this);
  this.commonType = commonType;
  this.description = description;
  this.isOperational = isOperational;
}
“…The best way to recover from programmer errors is to crash immediately. You should run your
programs using a restarter that will automatically restart the program in the event of a crash. With a
restarter in place, crashing is the fastest way to restore reliable service in the face of a transient
programmer error…”
“…By the very nature of how throw works in JavaScript, there is almost never any way to safely “pick
up where you left off”, without leaking references, or creating some other sort of undefined brittle
state. The safest way to respond to a thrown error is to shut down the process. Of course, in a
normal web server, you might have many connections open, and it is not reasonable to abruptly shut
those down because an error was triggered by someone else. The better approach is to send an
error response to the request that triggered the error while letting the others finish in their normal
time, and stop listening for new requests in that worker.”
From
Node.js official documentation
"Otherwise you risk the state of your application"
“…So, unless you really know what you are doing, you should perform a graceful
restart of your service after receiving an “uncaughtException” exception event.
Otherwise, you risk the state of your application, or that of 3rd party libraries to
become inconsistent, leading to all kinds of crazy bugs…”
// API route code, we catch both sync and async errors and forward to the middleware
try {
  customerService.addNew(req.body).then((result) => {
    res.status(200).json(result);
  }).catch((error) => {
    next(error);
  });
}
catch (error) {
  next(error);
}
// Error handling middleware, we delegate the handling to the centralized error handler
app.use((err, req, res, next) => {
  errorHandler.handleError(err).then((isOperationalError) => {
    if (!isOperationalError)
      next(err);
  });
});
Code example – handling errors within a dedicated object
function errorHandler() {
  this.handleError = function (error) {
    // note: the chain must start on the same line as 'return',
    // otherwise automatic semicolon insertion makes this return undefined
    return logger.logError(error)
      .then(sendMailToAdminIfCritical)
      .then(saveInOpsQueueIfCritical)
      .then(determineIfOperationalError);
  };
}
Code Example – Anti Pattern: handling errors within the
middleware
// middleware handling the error directly, who will handle Cron jobs and testing errors?
app.use((err, req, res, next) => {
  logger.logError(err);
  if (err.severity == errors.high)
    mailer.sendMail(configuration.adminMail, "Critical error occurred", err);
  if (!err.isOperational)
    next(err);
});
"Sometimes lower levels can’t do anything useful except
propagate the error to their caller"
“…You may end up handling the same error at several levels of the stack. This happens when lower levels can’t do
anything useful except propagate the error to their caller, which propagates the error to its caller, and so on. Often,
only the top-level caller knows what the appropriate response is, whether that’s to retry the operation, report an
error to the user, or something else. But that doesn’t mean you should try to report all errors to a single top-level
callback, because that callback itself can’t know in what context the error occurred…”
“……You should set useful properties in error objects, but use such properties
consistently. And, don’t cross the streams: HTTP errors have no place in your
database code. Or for browser developers, Ajax errors have a place in the code
that talks to the server, but not code that processes Mustache templates…”
TL;DR: Let your API callers know which errors might come in return
so they can handle these thoughtfully without crashing. This is
usually done with REST API documentation frameworks like
Swagger
“We’ve talked about how to handle errors, but when you’re writing a
new function, how do you deliver errors to the code that called your
function? …If you don’t know what errors can happen or don’t know
what they mean, then your program cannot be correct except by
accident. So if you’re writing a new function, you have to tell your
callers what errors can happen and what they mean…”
// Assuming developers mark known operational errors with error.isOperational=true, read best practice #3
process.on('uncaughtException', function(error) {
  errorManagement.handler.handleError(error);
  if (!errorManagement.handler.isTrustedError(error))
    process.exit(1)
});
“…By the very nature of how throw works in JavaScript, there is almost never any way to safely “pick
up where you left off”, without leaking references, or creating some other sort of undefined brittle
state. The safest way to respond to a thrown error is to shut down the process. Of course, in a
normal web server, you might have many connections open, and it is not reasonable to abruptly shut
those down because an error was triggered by someone else. The better approach is to send an
error response to the request that triggered the error while letting the others finish in their normal
time, and stop listening for new requests in that worker.”
From
Node.js official documentation
2.7 Use a mature logger to increase error visibility
var options = {
  from: new Date - 24 * 60 * 60 * 1000,
  until: new Date,
  limit: 10,
  start: 0,
  order: 'desc',
  fields: ['message']
};
1. Timestamp each log line. This one is pretty self-explanatory – you should be able to tell when
each log entry occurred.
2. Logging format should be easily digestible by humans as well as machines.
3. Allows for multiple configurable destination streams. For example, you might be writing trace
logs to one file but when an error is encountered, write to the same file, then into error file and
send an email at the same time…
Example: UpTimeRobot.com – website monitoring dashboard
2.9 Discover errors and downtime using APM products
Example: AppDynamics.com – end-to-end monitoring combined with code instrumentation
2.10 Catch unhandled promise rejections
DAL.getUserById(1).then((johnSnow) => {
  // this error will just vanish
  if (johnSnow.isAlive == false)
    throw new Error('ahhhh');
});
Code example: Catching unresolved and rejected promises
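One way such a catcher might look: subscribe to the process-wide 'unhandledRejection' event and route it into the same centralized handler used for uncaught exceptions (the handler object here is an illustrative stand-in):

```javascript
// Stand-in for the app's centralized error handler
const errorHandler = {
  handled: [],
  handleError(error) { this.handled.push(error.message); },
  isTrustedError(error) { return error.isOperational === true; },
};

process.on('unhandledRejection', (reason) => {
  // rethrow so the 'uncaughtException' path below treats it uniformly
  throw reason;
});

process.on('uncaughtException', (error) => {
  errorHandler.handleError(error);
  if (!errorHandler.isTrustedError(error)) process.exit(1);
});
```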
I don’t know about you, but my answer is that I’d expect all of them to print an error. However, the reality is that a number of modern JavaScript environments won’t print errors for any of them. The problem with being human is that if you can make a mistake, at some point you will. Keeping this in mind, it seems obvious that we should design things in such a way that mistakes hurt as little as possible, and that means handling errors by default, not discarding them.
function addNewMember(newMember)
{
// assertions come first
Joi.assert(newMember, memberSchema); //throws if validation fails
// other logic here
}
Anti-pattern: no validation yields nasty bugs
// if the discount is positive, redirect the user to print his discount coupons
function redirectToPrintDiscount(httpResponse, member, discount) {
  if (discount != 0)
    httpResponse.redirect(`/discountPrintView/${member.id}`);
}

redirectToPrintDiscount(httpResponse, someMember);
// forgot to pass the parameter discount, why the heck was the user redirected to the discount screen?
"You should throw these errors immediately"
TL;DR: ESLint is the de-facto standard for checking possible code errors and
fixing code style, not only to identify nitty-gritty spacing issues but also to
detect serious code anti-patterns like developers throwing errors without
classification. Though ESLint can automatically fix code styles, other tools like
prettier and beautify are more powerful in formatting the fix and work in
conjunction with ESLint
TL;DR: On top of ESLint standard rules that cover vanilla JS only, add
Node-specific plugins like eslint-plugin-node, eslint-plugin-mocha and
eslint-plugin-node-security
// Do
function someFunction() {
// code block
}
// Avoid
function someFunction()
{
// code block
}
3.4 Don't Forget the Semicolon
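A concrete example of why: without a terminating semicolon, a following line that starts with '[' is glued onto the previous expression by the parser:

```javascript
function demo() {
  const items = ['a', 'b']
  const first = items
  [0]   // parsed as items[0], so 'first' is the string 'a', not the array!
  return first
}

console.log(demo()); // 'a'
```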
TL;DR: Require modules at the beginning of each file, before and outside of any functions. This simple best practice will not only help you easily and quickly tell the dependencies of a file right at the top, but also avoid a couple of potential problems
// Do
module.exports.SMSProvider = require('./SMSProvider');
module.exports.SMSNumberResolver = require('./SMSNumberResolver');
// Avoid
module.exports.SMSProvider = require('./SMSProvider/SMSProvider.js');
module.exports.SMSNumberResolver = require('./SMSNumberResolver/SMSNumberResolver.js');
3.10 Use the === operator
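A few cases that make the point, following JavaScript's loose-equality coercion rules:

```javascript
// == coerces types before comparing, with surprising results
console.log('' == '0');  // false
console.log(0 == '');    // true
console.log(0 == '0');   // true

// === compares value and type, with no coercion
console.log(0 === '');   // false
console.log(0 === '0');  // false
console.log('' === '');  // true
```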
TL;DR: Node 8 LTS now has full support for Async-await. This is a new way of
dealing with asynchronous code which supersedes callbacks and promises.
Async-await is non-blocking, and it makes asynchronous code look synchronous.
The best gift you can give to your code is using async-await which provides a
much more compact and familiar code syntax like try-catch
Otherwise: You may spend long days on writing unit tests to find out
that you got only 20% system coverage
4.2 Detect code issues with a linter
TL;DR: Use a code linter to check basic quality and detect anti-
patterns early. Run it before any test and add it as a pre-commit git-
hook to minimize the time needed to review and correct any issue.
Also check Section 3 on Code Style Practices
Otherwise: Choosing some niche vendor might get you blocked once you need
some advanced customization. On the other hand, going with Jenkins might
burn precious time on infrastructure setup
4.3 Carefully choose your CI platform (Jenkins vs CircleCI vs
Travis vs Rest of the world)
One Paragraph Explainer
The CI world used to be the flexibility of Jenkins vs the simplicity of SaaS vendors. The game is now changing as SaaS providers like CircleCI and Travis offer robust solutions including Docker containers with minimum setup time, while Jenkins tries to compete in the 'simplicity' segment as well. Though one can set up a rich CI solution in the cloud, Jenkins is still the platform of choice when the finest details must be controlled. The choice eventually boils down to how far the CI process should be customized: free, setup-free cloud vendors allow running custom shell commands and custom Docker images, adjusting the workflow, running matrix builds, and other rich features. However, if controlling the infrastructure or programming the CI logic in a formal programming language like Java is desired, Jenkins might still be the choice. Otherwise, consider opting for the simple, setup-free cloud option
Code Example – a typical cloud CI configuration. Single .yml file
and that's it
version: 2
jobs:
  build:
    docker:
      - image: circleci/node:4.8.2
      - image: mongo:3.4.4
    steps:
      - checkout
      - run:
          name: Install npm wee
          command: npm install
  test:
    docker:
      - image: circleci/node:4.8.2
      - image: mongo:3.4.4
    steps:
      - checkout
      - run:
          name: Test
          command: npm test
      - run:
          name: Generate code coverage
          command: './node_modules/.bin/nyc report --reporter=text-lcov'
      - store_artifacts:
          path: coverage
          prefix: coverage
Circle CI - almost zero setup cloud CI
Jenkins - sophisticated and robust CI
4.4 Constantly inspect for vulnerable dependencies
TL;DR: Even the most reputable dependencies such as Express have known vulnerabilities. This can easily be tamed using community and commercial tools such as nsp that can be invoked from your CI on every build
TL;DR: End to end (e2e) testing which includes live data used to be the weakest link of the CI process, as it depends on multiple heavy services like databases. Docker-compose turns this problem into a breeze by crafting a production-like environment using a simple text file and easy commands. It allows crafting all the dependent services, a DB, and an isolated network for e2e testing. Last but not least, it can keep a stateless environment that is invoked before each test suite and dies right after
Monitoring example: AWS CloudWatch default dashboard. Hard to extract in-app metrics
5.1. Monitoring!
Monitoring example: StackDriver default dashboard. Hard to extract in-app metrics
Monitoring example: Grafana as the UI layer that visualizes raw data
“What Other Bloggers Say”
“…We recommend you to watch these signals for all of your services:
Error Rate: Because errors are user facing and immediately affect
your customers. Response time: Because the latency directly affects
your customers and business. Throughput: The traffic helps you to
understand the context of increased error rates and the latency too.
Saturation: It tells how “full” your service is. If the CPU usage is 90%,
can your system handle more traffic? …”
Visualization example: Kibana (part of the Elastic stack) facilitates advanced searching on log content
5.2. Increase transparency using smart logging
Visualization example: Kibana (part of the Elastic stack) visualizes data based on logs
“Logger Requirements”
1. Timestamp each log line. This one is pretty self-explanatory – you should be able
to tell when each log entry occurred.
2. Logging format should be easily digestible by humans as well as machines.
3. Allows for multiple configurable destination streams. For example, you might be
writing trace logs to one file but when an error is encountered, write to the same file,
then into error file and send an email at the same time…
StrongLoop
5.3. Delegate anything possible (e.g. gzip, SSL) to a
reverse proxy
It’s very tempting to cargo-cult Express and use its rich middleware offering for networking-related tasks like serving static files, gzip encoding, throttling requests, SSL termination, etc. This is a performance killer due to Node's single-threaded model, which will keep the CPU busy for long periods (remember, Node's execution model is optimized for short tasks or async IO-related tasks). A better approach is to use a tool that specializes in networking tasks; the most popular are nginx and HAProxy, which are also used by the biggest cloud vendors to lighten the incoming load on Node.js processes.
Nginx Config Example – Using nginx to compress server
responses
# configure gzip compression
gzip on;
gzip_comp_level 6;
gzip_vary on;

# configure upstream
upstream myApplication {
  server 127.0.0.1:3000;
  server 127.0.0.1:3001;
  keepalive 64;
}
“…It’s very easy to fall into this trap – You see a package like Express and think “Awesome! Let’s get started”
– you code away and you’ve got an application that does what you want. This is excellent and, to be honest,
you’ve won a lot of the battle. However, you will lose the war if you upload your app to a server and have it
listen on your HTTP port because you’ve forgotten a very crucial thing: Node is not a web server. As soon as
any volume of traffic starts to hit your application, you’ll notice that things start to go wrong: connections
are dropped, assets stop being served or, at the very worst, your server crashes. What you’re doing is
attempting to have Node deal with all of the complicated things that a proven web server does really well.
Why reinvent the wheel? This is just for one request, for one image and bearing in mind this is the memory
that your application could be used for important stuff like reading a database or handling complicated
logic; why would you cripple your application for the sake of convenience?”
“Although express.js has built-in static file handling through some connect
middleware, you should never use it. Nginx can do a much better job of handling
static files and can prevent requests for non-dynamic content from clogging our
node processes…”
Otherwise: QA will thoroughly test the code and approve a version that will
behave differently at production. Even worse, different servers at the same
production cluster might run different code
5.4. Lock dependencies
One Paragraph Explainer
Your code depends on many external packages; let's say it 'requires' and uses momentjs 2.1.4. Then, by default, when you deploy to production NPM might fetch momentjs 2.1.5, which unfortunately brings some new bugs to the table. Using NPM config files and the argument --save-exact=true instructs NPM to refer to the exact same version that was installed, so the next time you run npm install (in production or within a Docker container you plan to ship forward for testing) the same dependent version will be fetched. An alternative and popular approach is using a .shrinkwrap file (easily generated using NPM) that states exactly which packages and versions should be installed, so no environment can get tempted to fetch newer versions than expected.
{
  "name": "A",
  "dependencies": {
    "B": {
      "version": "0.0.1",
      "dependencies": {
        "C": {
          "version": "0.1.0"
        }
      }
    }
  }
}
Code example: NPM 5 dependencies lock file – package-lock.json
{
  "name": "package-name",
  "version": "1.0.0",
  "lockfileVersion": 1,
  "dependencies": {
    "cacache": {
      "version": "9.2.6",
      "resolved": "https://registry.npmjs.org/cacache/-/cacache-9.2.6.tgz",
      "integrity": "sha512-YK0Z5Np5t755edPL6gfdCeGxtU0rcW/DBhYhYVDckT+7AFkCCtedf2zru5NRbBLFk6e7Agi/RaqTOAfiaipUfg=="
    },
    "duplexify": {
      "version": "3.5.0",
      "resolved": "https://registry.npmjs.org/duplexify/-/duplexify-3.5.0.tgz",
      "integrity": "sha1-GqdzAC4VeEV+nZ1KULDMquvL1gQ=",
      "dependencies": {
        "end-of-stream": {
          "version": "1.0.0",
          "resolved": "https://registry.npmjs.org/end-of-stream/-/end-of-stream-1.0.0.tgz",
          "integrity": "sha1-1FlucCc0qT5A6a+GQxnqvZn/Lw4="
        }
      }
    }
  }
}
5.5. Guard process uptime using the right tool
TL;DR: The process must go on and get restarted upon failures. For simple scenarios, 'restarter' tools like PM2 might be enough, but in today's 'dockerized' world, cluster management tools should be considered as well
“... In development, you started your app simply from the command line with node
server.js or something similar. But doing this in production is a recipe for disaster. If
the app crashes, it will be offline until you restart it. To ensure your app restarts if it
crashes, use a process manager. A process manager is a “container” for
applications that facilitate deployment, provides high availability, and enables you to
manage the application at runtime.”
From the
Express Production Best Practices
“What Other Bloggers Say”
TL;DR: In its basic form, a Node app runs on a single CPU core while all the other cores are left idling. It's your duty to replicate the Node process and utilize all CPUs. For small-medium apps you may use Node Cluster or PM2; for a larger app consider replicating the process using some Docker cluster (e.g. K8S, ECS) or deployment scripts based on a Linux init system (e.g. systemd)
Otherwise: Your app will likely utilize only 25% of its available resources(!) or
even less. Note that a typical server has 4 CPU cores or more, naive deployment
of Node.js utilizes only 1 (even using PaaS services like AWS beanstalk!)
5.6. Utilize all CPU cores
Comparison: Balancing using Node’s cluster vs nginx
“What Other Bloggers Say”
“... The second approach, Node clusters, should, in theory, give the
best performance. In practice, however, distribution tends to be very
unbalanced due to operating system scheduler vagaries. Loads have
been observed where over 70% of all connections ended up in just two
processes, out of a total of eight ...”
From the
Node.js documentation
“What Other Bloggers Say”
“... Clustering is made possible with Node’s cluster module. This enables a master
process to spawn worker processes and distribute incoming connections among
the workers. However, rather than using this module directly, it’s far better to use
one of the many tools out there that do it for you automatically; for example node-
pm or cluster-service ...”
StrongLoop
“What Other Bloggers Say”
“... Node cluster is simple to implement and configure, things are kept
inside Node’s realm without depending on other software. Just
remember your master process will work almost as much as your
worker processes and with a little less request rate than the other
solutions ...”
APM example: a commercial product that visualizes cross-service app performance
5.8. Discover errors and downtime using APM products
APM example – a
commercial product
that emphasizes the
user experience
score
5.8. Discover errors and downtime using APM products
APM example – a
commercial
product that
highlights slow
code paths
5.9. Make your code production-ready
... ”As we already learned, in Node.js JavaScript is compiled to native code by V8.
The resulting native data structures don’t have much to do with their original
representation and are solely managed by V8. This means that we cannot actively
allocate or deallocate memory in JavaScript. V8 uses a well-known mechanism
called garbage collection to address this problem.”
... “By default, Node.js will try to use about 1.5GB of memory, which has to be capped when running on
systems with less memory. This is the expected behavior, as garbage collection is a very costly
operation. The solution was adding an extra parameter to the Node.js process: node
--max_old_space_size=400 server.js --production” “Why is garbage collection expensive? The V8
JavaScript engine employs a stop-the-world garbage collector mechanism. In practice, it means
that the program stops execution while garbage collection is in progress.”
“…In development, you can use res.sendFile() to serve static files. But don’t do
this in production, because this function has to read from the file system for
every file request, so it will encounter significant latency and affect the overall
performance of the app. Note that res.sendFile() is not implemented with the
sendfile system call, which would make it far more efficient. Instead, use serve-
static middleware (or something equivalent), that is optimized for serving files
for Express apps. An even better option is to use a reverse proxy to serve static
files; see Use a reverse proxy for more information…”
StrongLoop
5.13. Use tools that automatically detect vulnerabilities
“...Using npm to manage your application's dependencies is powerful and convenient. But the
packages that you use may contain critical security vulnerabilities that could also affect your
application. The security of your app is only as strong as the “weakest link” in your dependencies.
Fortunately, there are two helpful tools you can use to ensure the third-party packages you use:
nsp and requireSafe. These two tools do largely the same thing, so using both might be overkill, but
“better safe than sorry” are words to live by when it comes to security...”
From the blog
StrongLoop
5.14. Assign ‘TransactionId’ to each log statement
A typical log is a warehouse of entries from all components and requests. Upon detecting a suspicious
line or error, it becomes hairy to match other lines that belong to the same specific flow (e.g. the user “John”
tried to buy something). This becomes even more critical and challenging in a microservice environment, where
a request/transaction might span multiple computers. Address this by assigning a unique transaction
identifier value to all the entries from the same request, so that when detecting one line, one can copy the id and
search for every line that has the same transaction Id. However, achieving this in Node is not straightforward, as
a single thread is used to serve all requests – consider using a library that can group data at the request
level – see the code example on the next slide. When calling other microservices, pass the transaction Id using an
HTTP header like “x-transaction-id” to keep the same context.
Code example: typical Express configuration
// When receiving a new request, start a new isolated context and set a transaction Id.
// The following example uses the NPM library continuation-local-storage to isolate requests
const { createNamespace } = require('continuation-local-storage');
const session = createNamespace('my session');
router.get('/:id', (req, res, next) => {
    session.set('transactionId', 'some unique GUID');
    someService.getById(req.params.id);
});

// Now any other service or component can access the contextual, per-request, data
class someService {
    getById(id) {
        logger.info('Starting to get something by Id');
        // other logic comes here
    }
}

// The logger can now append the transaction-id to each entry so that entries from the same request will have the same value
class logger {
    info(message) {
        console.log(`${message} ${session.get('transactionId')}`);
    }
}
5.15. Set NODE_ENV=production
“...In Node.js there is a convention to use a variable called NODE_ENV to set the
current mode. We see that it, in fact, reads NODE_ENV and defaults to
‘development’ if it isn’t set. We clearly see that by setting NODE_ENV to production
the number of requests Node.js can handle jumps by around two-thirds while the
CPU usage even drops slightly. Let me emphasize this: Setting NODE_ENV to
production makes your application 3 times faster!”
dynatrace
5.16. Design automated, atomic and zero-downtime
deployments
Otherwise: Long deployments -> production downtime & human error -> a team
that lacks confidence in deploying -> fewer deployments and features
5.17. Use an LTS release of Node.js
“Linting doesn’t have to be just a tool to enforce pedantic rules about whitespace,
semicolons or eval statements. ESLint provides a powerful framework for
eliminating a wide variety of potentially dangerous patterns in your code (regular
expressions, input validation, and so on). I think it provides a powerful new tool
that’s worthy of consideration by security-conscious JavaScript developers.”
Adam Baldwin
6.2. Limit concurrent requests using a balancer or a
middleware
“Rate limiting can be used for security purposes, for example to slow down
brute‑force password‑guessing attacks. It can help protect against DDoS attacks
by limiting the incoming request rate to a value typical for real users, and (with
logging) identify the targeted URLs. More generally, it is used to protect upstream
application servers from being overwhelmed by too many user requests at the
same time.”
From the
NGINX blog
6.3. Extract secrets from config files or use an NPM package that
encrypts them
You can also configure SSL/TLS on your reverse proxy pointing to your
application, for example using Nginx or HAProxy.
Code Example – Enabling SSL/TLS using the Express
framework
TL;DR: When comparing secret values or hashes like HMAC digests, use the
crypto.timingSafeEqual(a, b) function that Node provides out of the box since
Node.js v6.6.0. This method compares two buffers in constant time, continuing
the comparison even after data stops matching. The default equality checks
would simply return after the first character mismatch, allowing timing attacks
based on the operation length.
Otherwise: Using default equality comparison operators you might expose critical
information based on the time taken to compare two objects
Generating random strings using Node.js
“... it’s not just using the right hashing algorithm. I’ve talked extensively about how
the right tool also includes the necessary ingredient of “time” as part of the
password hashing algorithm and what it means for the attacker who’s trying to
crack passwords through brute-force.”
Max McCarty
6.9. Use middleware that sanitizes input and output
{
"$schema": "http://json-schema.org/draft-06/schema#",
"title": "Product",
"description": "A product from Acme's catalog",
"type": "object",
"properties": {
"name": {
"description": "Name of the product",
"type": "string"
},
"price": {
"type": "number",
"exclusiveMinimum": 0
}
},
"required": ["id", "name", "price"]
}
Example - Validating an entity using JSON-Schema
class Product {
    validate() {
        const v = new JSONValidator();
        return v.validate(this, productSchema); // productSchema is the JSON-Schema shown above
    }
}
// The validator is a generic middleware that gets the entity it should validate and takes care to return
// HTTP status 400 (Bad Request) should the body payload validation fail
router.post("/", validator(Product.validate), async (req, res, next) => {
    // route handling code goes here
});
“What Other Bloggers Say”
“Validating user input is one of the most important things to do when it comes to
the security of your application. Failing to do it correctly can open up your
application and users to a wide range of attacks, including command injection,
SQL injection or stored cross-site scripting.”
Gergely Nemeth
6.11. Support blacklisting JWT tokens
TL;DR: There is a common scenario where Node.js runs as the root user with
unlimited permissions; for example, this is the default behaviour in Docker
containers. It's recommended to create a non-root user and always run the
process on this user's behalf by invoking the container with the flag "-u username"
Otherwise: An attacker who manages to run a script on the server gets unlimited
power over the local machine (e.g. change iptables and re-route traffic to his
server)
6.13. Run Node.js as non-root user
FROM node:latest
COPY package.json .
RUN npm install
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]
“What Other Bloggers Say”
“By default, Docker runs container as root which inside of the container can pose
as a security issue. You would want to run the container as an unprivileged user
wherever possible. The node images provide the node user for such purpose. The
Docker Image can then be run with the node user in the following way:”
eyalzek
6.14. Limit payload size using a reverse-proxy or a middleware
app.use(express.json({ limit: '300kb' })); // body-parser defaults to a body size limit of 100kb

app.post('/json-payload', (req, res) => { // route path is illustrative
    // Check if the request payload content-type matches json, because body-parser does not check for content types
    if (!req.is('json')) {
        return res.sendStatus(415); // -> Unsupported Media Type if the request doesn't have a JSON body
    }
    res.send('Hooray, it worked!');
});
http {
...
# Limit the body size for ALL incoming requests to 1 MB
client_max_body_size 1m;
}
server {
...
# Limit the body size for incoming requests to this specific server block to 1 MB
client_max_body_size 1m;
}
location /upload {
...
# Limit the body size for incoming requests to this route to 1 MB
client_max_body_size 1m;
}
6.15. Avoid JS eval statements
TL;DR: eval may be used to evaluate JavaScript code at run-time, but it is
not just a performance concern: it is also a serious security concern, since
malicious JavaScript code may be sourced from user input. Another language
feature that should be avoided is the new Function constructor. setTimeout
and setInterval should never be passed dynamic JavaScript code either.
Otherwise: If malicious JavaScript code finds its way into text passed to eval or
other real-time evaluating JavaScript language features, it will gain complete
access to the JavaScript permissions on the page, often manifesting as an XSS
attack.
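When the goal is parsing data rather than executing code, JSON.parse is the safe replacement for eval, as this small sketch shows:

```javascript
// eval('(' + userInput + ')') would execute any code embedded in the string;
// JSON.parse only parses data and never executes anything
const userInput = '{"name": "widget", "price": 3}';
const parsed = JSON.parse(userInput);
console.log(parsed.price); // 3
```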
6.15. Avoid JS eval statements
“The eval() function is perhaps one of the most frowned upon JavaScript pieces from a
security perspective. It parses a JavaScript string as text, and executes it as if it
were JavaScript code. Mixing that with untrusted user input that might find its
way to eval() is a recipe for disaster that can end up with server compromise.”
Liran Tal
6.16. Prevent malicious RegEx from overloading your single
thread execution
The risk inherent in the use of Regular Expressions is the
computational resources required to parse text and match a given
pattern. For the Node.js platform, where a single-threaded event loop is
dominant, a CPU-bound operation like resolving a regular expression
pattern will render the application unresponsive. Avoid regex when
possible, defer the task to a dedicated library like validator.js, or
use safe-regex to check whether a regex pattern is safe.
Some OWASP examples for vulnerable regex patterns:
•(a|aa)+
•([a-zA-Z]+)*
“Often, programmers will use RegEx to validate that an input received from a user
conforms to an expected condition. A vulnerable Regular Expression is known as
one which applies repetition to a repeating capturing group, and where the string
to match is composed of a suffix of a valid matching pattern plus characters that
aren't matching the capturing group.”
Liran Tal
6.17. Avoid module loading require(someVariable) using a
variable
TL;DR: Avoid requiring/importing another file with a path that was given as a
parameter, due to the concern that it could have originated from user input.
This rule can be extended to accessing files in general (i.e. fs.readFile()) or
other sensitive resource access with dynamic variables originating from user
input.
Otherwise: Malicious user input could find its way into a parameter that is used
to require tampered files, for example a previously uploaded file on the
filesystem, or to access already existing system files.
Code example
// insecure – helperPath may have been tampered with via user input
const uploadHelpers = require(helperPath);

// secure
const uploadHelpers = require('./helpers/upload');
6.18. Run unsafe code in a sandbox
Exposing application error details to the client in production should be avoided due to risk
of exposing sensitive application details such as server filepaths, third party modules in
use, and other internal workings of the application which could be exploited by an attacker.
Express comes with a built-in error handler, which takes care of any errors that might be
encountered in the app. This default error-handling middleware function is added at the
end of the middleware function stack. If you pass an error to next() and you do not handle
it in a custom error handler, it will be handled by the built-in Express error handler, and the
error will be written to the client with the stack trace. This behaviour applies when
NODE_ENV is set to development; when NODE_ENV is set to production, the stack
trace is not written, only the HTTP response code.
Code example
The most common setting left at its default is the session name – in express-session this is connect.sid. An
attacker can use this information to identify the underlying framework of the web application, as well as
module-specific vulnerabilities. Changing this value to something other than the default will make it harder
to determine which session mechanism is being used.
Also in express-session, the option cookie.secure is set to false by default. Changing this to true
will restrict transport of the cookie to HTTPS only, which provides safety from man-in-the-middle attacks.
Code example: Setting secure cookie settings
“...Express has default cookie settings that aren’t highly secure. They can be
manually tightened to enhance security - for both an application and its user.”
From the
NodeSource blog
6.21. Modify session middleware
settings