Handling high traffic and load balancing in Node.js

The thrill of seeing your user base skyrocket will only last until your server crashes! In this article, Giridhar Talla discusses several strategies to make your Node.js applications ready to handle high traffic volumes.

Most developers build their projects without thinking about future scalability. However, as your project gains popularity, you will find it challenging to handle high traffic and scale the app. The thrill of seeing your user base skyrocket will only last until your server crashes, and frequent downtime will erode your product's reliability. In this article, I'll discuss several strategies, including vertical scaling, horizontal scaling, and load balancing, to make your Node.js applications capable of handling high traffic.

Node.js is a good choice for building scalable and reliable applications thanks to its event-driven, non-blocking architecture. However, imagine that your app quickly becomes extremely popular. The surge in user activity, known as high traffic, places a heavy load on the server and can degrade its performance. It might even cause the server to crash, resulting in downtime that takes time to fix. To keep your app responsive even during peak usage, it's important to use scaling strategies. There are two main approaches: "vertical scaling" and "horizontal scaling."

Bigger, better, faster, stronger

Vertical scaling means "scaling up" the resources of your server: upgrading the existing machine with more memory (RAM), storage, and computation resources (CPU cores), making it bigger and faster. Vertical scaling increases the existing hardware's capacity to handle high loads. It also simplifies server management because only one server is in use.

However, Node.js runs your JavaScript on a single thread, which means that even if your CPU has 16 cores, a Node.js app uses only one of them by default. Clustering is one way to solve this issue and unlock the full potential of your CPU. Using the built-in cluster module, you can easily create and run several worker processes to share the load. Alternatively, you can avoid writing your own process manager by using pm2, an npm package built for production-grade applications. You can find its documentation here.

Vertical scaling also has limits: there comes a point where your server can't handle any further upgrades. The number of available CPU cores also caps how many worker processes can usefully run in parallel, which is why "horizontal scaling" comes in handy.

Let's multiply those servers

Horizontal scaling is spinning up multiple servers (clones) instead of limiting yourself to one. In this case, you will have more servers ready to take on the load of incoming traffic. Horizontal scaling improves performance and offers better fault tolerance and scalability.

Balancing the load

When you have multiple servers, it can get tricky to distribute traffic evenly among them. This is why load balancers are needed: like traffic cops, they direct incoming requests to the least busy server. A load balancer stands between your clients and your servers, ensuring no server is left idle while others do all the work. Load balancing also reduces complexity because you don't need to manage traffic to each server individually. It keeps traffic flowing smoothly and helps prevent bottlenecks, which occur when one server gets overwhelmed with requests. Finally, it increases the availability of your application: if one server crashes, the load balancer gracefully redirects traffic to the others, avoiding downtime.

How the load balancer works

The load balancer stands between your clients and your servers. All traffic first hits the load balancer, and that's where the magic begins: it uses an algorithm to decide which server should handle each request. There are more than ten load-balancing algorithms; we will cover four of them in this article. The choice of algorithm is crucial, since it determines how efficiently incoming traffic is distributed across servers to maintain performance, availability, and a smooth user experience.

1. Round robin

In this algorithm, the load balancer is dumb and makes no decisions. It simply distributes the load across the servers sequentially, assigning each request to the next server in turn. No server is left out, and the load is distributed evenly. You can find more about how it works here.
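The rotation itself takes only a few lines. Here is a minimal sketch; the backend URLs are made up for illustration:

```javascript
// Round robin: hand each request to the next server in the list,
// wrapping back to the start when we reach the end.
const servers = [
  'http://10.0.0.1:3000',
  'http://10.0.0.2:3000',
  'http://10.0.0.3:3000',
];
let next = 0;

function pickServer() {
  const server = servers[next];
  next = (next + 1) % servers.length; // advance and wrap around
  return server;
}
```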

However, in situations where users stay connected for different lengths of time, some for just a few minutes while others stay for an hour, the Round Robin algorithm might not be ideal for you.

2. Smart load balancing

In smart load balancing, the load balancer makes decisions dynamically, based on real-time collaboration between the servers and the load balancer. It's not just about directing traffic; it's about doing it intelligently to ensure optimal performance, resource utilization, and a seamless user experience.

However, being smart takes work: a smart load balancer is more expensive and more complex to set up. Its behavior can't be achieved with a single algorithm; it typically combines real-time monitoring, and sometimes machine learning, to decide which server should receive each request.

3. Least connections

You can use the least connections algorithm to reduce complexity and still handle traffic smoothly. This algorithm directs each request to the server with the fewest active connections, ensuring that no server is overwhelmed with requests while others remain idle. It's all about maintaining equilibrium in the group of servers.
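Here is a sketch of the selection step, assuming the balancer tracks an active-connection count per server (the URLs and counts are example values):

```javascript
// Least connections: track active connections per backend and always
// pick the server that is currently doing the least work.
const pool = [
  { url: 'http://10.0.0.1:3000', active: 4 },
  { url: 'http://10.0.0.2:3000', active: 1 },
  { url: 'http://10.0.0.3:3000', active: 7 },
];

function pickLeastConnections(pool) {
  return pool.reduce((least, server) =>
    server.active < least.active ? server : least
  );
}

// The balancer increments `active` when it forwards a request
// and decrements it when the response completes.
```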

4. Weighted round robin

The methods we’ve covered so far are suitable for identical, cloned servers. However, you might want to mix vertical and horizontal scaling by giving some servers more resources than others. This is where the weighted method comes in. In this algorithm, servers are assigned weights based on their capacity and available resources: the higher the weight, the more requests a server receives. This approach uses servers efficiently by matching the workload to their capabilities.
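One common variant is "smooth" weighted round robin, which spreads the heavy server's turns out instead of sending them in a burst. A minimal sketch, with arbitrary example weights:

```javascript
// Smooth weighted round robin: on each pick, every server gains its weight,
// the highest-scoring server wins and then pays back the total weight.
const pool = [
  { url: 'http://10.0.0.1:3000', weight: 5, current: 0 }, // beefy server
  { url: 'http://10.0.0.2:3000', weight: 1, current: 0 },
  { url: 'http://10.0.0.3:3000', weight: 1, current: 0 },
];

function pickWeighted(pool) {
  let totalWeight = 0;
  let best = null;
  for (const server of pool) {
    server.current += server.weight;
    totalWeight += server.weight;
    if (best === null || server.current > best.current) best = server;
  }
  best.current -= totalWeight; // penalize the winner so others get a turn
  return best;
}
```

With weights 5/1/1, any seven consecutive picks send five requests to the heavy server, interleaved with one each to the other two.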

In addition to these algorithms, you can find many other load-balancing algorithms here.

Health checks

Health checks are automated, periodic tests that determine the operational status (the "health") of your servers. In the world of load balancing, health checks are crucial because they tell the load balancer whether each server is ready to handle incoming requests. By continuously checking your servers, you can detect issues, such as server failures, before they impact your end users, and the load balancer will stop sending traffic to unhealthy servers.
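One way to sketch this is with the probe function injected, so that a real HTTP check (say, a GET to a hypothetical /healthz endpoint) can be dropped in later:

```javascript
// Periodically probe every server and keep only the healthy ones
// in the rotation. `probe(url)` should resolve truthy when the
// server answers OK, and throw (or resolve falsy) otherwise.
const pool = [
  { url: 'http://10.0.0.1:3000', healthy: true },
  { url: 'http://10.0.0.2:3000', healthy: true },
];

async function runHealthChecks(pool, probe) {
  await Promise.all(
    pool.map(async (server) => {
      try {
        server.healthy = Boolean(await probe(server.url));
      } catch {
        server.healthy = false; // timeouts and network errors count as down
      }
    })
  );
  // Only healthy servers should receive traffic.
  return pool.filter((server) => server.healthy);
}
```

In production you would schedule this, for example with `setInterval(() => runHealthChecks(pool, probe), 10000)`, and have your selection algorithm pick only from the returned healthy set.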

How to implement load balancing

There are two primary kinds of load balancers: hardware and software. In most cases, software load balancers are enough. However, for bigger applications, you can use dedicated physical devices (hardware load balancers) to distribute traffic across multiple servers. These often come with advanced features, such as SSL termination, caching, and health checks.

In software load balancing, there are three approaches:

  1. Create your own load balancer using Express.js. You can create one by following this guide.
  2. Use reverse proxies like NGINX and HAProxy.
  3. Use third-party cloud services, such as AWS Elastic Load Balancing, Google Cloud Load Balancing, or Azure Load Balancer. They save you from writing a load balancer from scratch and provide multiple features out of the box, including traffic distribution, health checks, continuous monitoring, and logging. Because they are fully managed services, cloud load balancers are also known for their ease of configuration and cost-effectiveness.

Sometimes a simple server with a round-robin algorithm is enough; in other cases, you'll need more sophisticated handling. Choose a method after carefully analyzing your requirements.


To wrap up, we've explored a variety of scaling strategies in this article. You've discovered vertical scaling, where you boost a server's power with extra resources, and horizontal scaling, which involves making clones of the server. Managing multiple servers requires a traffic controller, and that's where the load balancer steps in. We have also discussed dumb and smart ways of implementing load balancers. Smart load balancing may sound complex, but simpler algorithms can still make your Node.js application capable of handling high traffic. Make sure to choose an algorithm that suits your load-balancing needs, so your application remains responsive and reliable as your user base grows.
