By Ali Hodroj, Eliza Croen
It has long been known that almost half of a retail website’s shoppers will wait no more than 3 seconds for its content to render before they abandon it. Today, in an omni-channel retailing world where the online and offline channels converge through mobile shopping and in-store pickup, falling short of the 3-second rule not only drives up web and mobile shopping abandonment rates, but also causes a significant loss of brick-and-mortar revenue.
With 2015 shaping up to be another banner year for e-retail, breaking the $300B web sales record of 2014, major retailers continue to chase the ever-elusive 100% web/mobile channel uptime throughout peak holiday shopping events. Recent numbers from the National Retail Federation show 103 million online shoppers spending billions throughout this past Black Friday holiday weekend. In 2015 and beyond, high-performance, high-availability internet retailing is a make-or-break factor for every retailer.
This Black Friday, we saw retailers including Target, PayPal, Victoria’s Secret, and others experience downtime or slow site performance throughout an action-packed holiday weekend and into Cyber Monday. Neiman Marcus was down for a costly 12 hours, while other retailers, such as Jet.com and Newegg, experienced page load times that exceeded 10 seconds.
At GigaSpaces, we’ve worked with many top retailers, and have hundreds of Fortune 1000 customers using our highly scalable in-memory computing platform. As part of serving these retailers through professional services, field engineering, and on-site holiday support, we’ve gathered a vast amount of expertise and best practices. Based on that, below are the top 5 recommendations a retailer should keep in mind before the web traffic onslaught of Black Friday and Cyber Monday.
Top 5 recommendations to prepare a high performance infrastructure for Black Friday weekend:
1. Plan your scalability units, then architect and partition to scale them horizontally
An eCommerce system’s scalability is limited when it relies on monolithic database stores for inventory, product, and order data. When architecting your eCommerce backend services, you should consider partitioning your data and business logic across an in-memory data grid to reflect your horizontal eCommerce scalability axes: product page views per session, inventory lookups and allocations per user group, product lookups per category, and so on. Scaling to higher traffic across channels can then be accomplished by adding more data grid partitions.
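To make the partitioning idea concrete, here is a minimal sketch of key-based routing, assuming a simple stable-hash-modulo scheme. The names `partition_for` and `PartitionedInventory` are hypothetical and are not part of any GigaSpaces API; plain dictionaries stand in for data grid partitions.

```python
import hashlib


def partition_for(routing_key: str, partition_count: int) -> int:
    """Map a routing key (here, a SKU) to a partition index with a stable hash."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


class PartitionedInventory:
    """Illustrative stand-in for a partitioned grid: one dict per partition."""

    def __init__(self, partition_count: int) -> None:
        self.partitions = [dict() for _ in range(partition_count)]

    def _partition(self, sku: str) -> dict:
        # Every operation on a SKU is routed to the single partition that owns it.
        return self.partitions[partition_for(sku, len(self.partitions))]

    def stock(self, sku: str, quantity: int) -> None:
        partition = self._partition(sku)
        partition[sku] = partition.get(sku, 0) + quantity

    def allocate(self, sku: str, quantity: int) -> bool:
        """Reserve units if available; only the owning partition is touched."""
        partition = self._partition(sku)
        if partition.get(sku, 0) < quantity:
            return False
        partition[sku] -= quantity
        return True
```

Because each SKU lives in exactly one partition, lookups and allocations for different SKUs do not compete for the same store. Note that with plain modulo routing, changing the partition count remaps keys, so a real data grid has to rebalance data when partitions are added; this sketch does not attempt that.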
2. Minimize or eliminate contention during lookup, orders, and wherever else
Building a highly scalable system for low latency starts by eliminating synchronous calls as much as possible, while embracing asynchrony at I/O boundaries. Synchronous calls lead to excessive thread creation that significantly degrades system performance during heavy bursts of traffic. Push your database to the background by introducing a partitioned in-memory data grid as the system of record while asynchronously persisting to the database. In an eCommerce system, this decouples the memory I/O latency of an inventory allocation call from the underlying database disk I/O latency.
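The asynchronous persistence pattern described here is often called write-behind. Below is a minimal single-process sketch, assuming an in-memory dictionary as the system of record and a background thread that drains a queue into the database. `WriteBehindStore` and its `persist` callback are hypothetical names used for illustration, not a vendor API.

```python
import queue
import threading


class WriteBehindStore:
    """In-memory store that persists writes on a background thread."""

    def __init__(self, persist) -> None:
        self._memory = {}
        self._lock = threading.Lock()
        self._pending = queue.Queue()
        self._persist = persist  # callable(key, value); stands in for a database write
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def put(self, key, value) -> None:
        # The caller pays only for the memory write and an enqueue.
        with self._lock:
            self._memory[key] = value
        self._pending.put((key, value))

    def get(self, key):
        with self._lock:
            return self._memory.get(key)

    def flush(self) -> None:
        """Block until every queued write has been handed to persist."""
        self._pending.join()

    def _drain(self) -> None:
        while True:
            key, value = self._pending.get()
            try:
                self._persist(key, value)
            except Exception:
                # Assumption: a real system would retry or record the failure here.
                pass
            finally:
                self._pending.task_done()
```

A call such as `store.put("sku-1", 7)` returns as soon as the value is in memory, and `store.get("sku-1")` sees it immediately, while the database write happens later on the worker thread. The trade-off is that queued writes can be lost if the process dies before they are persisted, which is one reason in-memory replicas matter in this kind of design.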
3. Avoid sticky sessions as much as possible
Keeping session affinity at the application server level guarantees that the system will not be able to scale under extreme loads. The web tier of an eCommerce system should be able to call any application server and have both its read-type (product lookup) and write-type (shopping cart allocation) operations available consistently from any application server. With GigaSpaces, a shopping cart session (HTTP Session) is delegated through a servlet facade to an in-memory data grid (distributed cache) that is accessible from any application server, thereby eliminating session stickiness.
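The effect of externalizing session state can be shown with a small sketch in which two application servers share one session store. `SharedSessionStore` and `AppServer` are hypothetical illustration classes; the store is an in-process dictionary standing in for a distributed cache, and concurrent updates to the same session are not handled here.

```python
import copy


class SharedSessionStore:
    """Stand-in for a distributed session cache reachable from every server."""

    def __init__(self) -> None:
        self._sessions = {}

    def load(self, session_id: str) -> dict:
        # Return a copy so servers never hold references into the store.
        return copy.deepcopy(self._sessions.get(session_id, {}))

    def save(self, session_id: str, data: dict) -> None:
        self._sessions[session_id] = copy.deepcopy(data)


class AppServer:
    """Stateless server: all session state is read from and written to the store."""

    def __init__(self, name: str, sessions: SharedSessionStore) -> None:
        self.name = name
        self.sessions = sessions

    def add_to_cart(self, session_id: str, sku: str, quantity: int) -> None:
        session = self.sessions.load(session_id)
        cart = session.setdefault("cart", {})
        cart[sku] = cart.get(sku, 0) + quantity
        self.sessions.save(session_id, session)

    def view_cart(self, session_id: str) -> dict:
        return self.sessions.load(session_id).get("cart", {})
```

Since neither server keeps the cart in local memory, a load balancer is free to send consecutive requests from the same shopper to different servers, and either server will see the same cart.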
4. Consider high availability at every infrastructure tier, and strive for it across data centers
In the words of Netflix’s Adrian Cockcroft: “scale breaks hardware, speed breaks software; speed at scale breaks everything”. This is all too common, especially during extreme load events like Black Friday or Cyber Monday. One should plan for fault tolerance and redundancy at every level: the data shard (JVM-level redundancy), the application server (machine-level redundancy), the availability zone (hypervisor/network subnet), and finally the data center. Retailers utilizing GigaSpaces XAP enjoy the flexibility of spreading their redundancy across machines, hypervisors, and clouds/data centers. The diagram below outlines common high availability approaches with XAP:
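Independent of the diagram, the layered redundancy idea in this recommendation can be sketched as a placement rule: put a shard's backup in the widest failure domain that separates it from its primary. The `Node` type and the functions `isolation_level` and `pick_backup` are hypothetical names for illustration only and do not describe how XAP places replicas.

```python
from dataclasses import dataclass

# Failure domains ordered from weakest to strongest separation.
LEVELS = ["none", "process", "machine", "zone", "data_center"]


@dataclass(frozen=True)
class Node:
    name: str         # one JVM / process
    machine: str
    zone: str
    data_center: str


def isolation_level(primary: Node, backup: Node) -> str:
    """Return the widest failure domain that separates the two nodes."""
    if primary.data_center != backup.data_center:
        return "data_center"
    if primary.zone != backup.zone:
        return "zone"
    if primary.machine != backup.machine:
        return "machine"
    if primary.name != backup.name:
        return "process"
    return "none"


def pick_backup(primary: Node, candidates: list) -> Node:
    """Choose the candidate with the strongest separation from the primary."""
    others = [c for c in candidates if c != primary]
    if not others:
        raise ValueError("no candidate other than the primary is available")
    return max(others, key=lambda c: LEVELS.index(isolation_level(primary, c)))
```

Given a primary on one machine and candidates on the same machine, a different machine, and a different zone, `pick_backup` selects the node in the different zone, so a single machine or zone failure cannot take out both copies. A real placement policy would also weigh replication latency and capacity, which this sketch ignores.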
5. Embrace your vendor’s site reliability field engineering teams
Surviving Black Friday is not a one-team show. It is in your vendor’s best interest that you experience a positive holiday event, so rely on them to get it done. The days of siloed development, operations, and vendor support teams are nearing their end. Every retailer working with GigaSpaces utilizes an on-site team of site reliability field engineers who monitor the system using state-of-the-art DevOps automation and monitoring tools to proactively eliminate bottlenecks and outages during peak volume events.
These five steps are the secret sauce behind our customers’ holiday season success. Year after year, XAP users pass the holiday season with flying colors, maintaining 100% uptime, avoiding system glitches, and sustaining top-speed site performance through peak loads. In 2014, GigaSpaces customers experienced a 139% increase in sales, thanks in large part to XAP in-memory computing technology and GigaSpaces staff providing support prior to and throughout each holiday shopping event.
May your holiday season be merry, bright and highly available.