
Communities Performance and Scale Best Practices (for Customers)
Frank Leahy, VP Performance Engineering, Community
July 2020, v1.3

Forward-Looking Statements

Statement under the Private Securities Litigation Reform Act of 1995:

This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.

The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.

Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

Performance vs Scale

● Performance - the time it takes to load a single page in the desktop browser or on a mobile device
● Scale - the ability of a system to serve pages under increasing load while maintaining page-level performance

Before you ship, you’ll want to be sure that your implementation is performant, and that it scales.

Terminology

● Performance - the time it takes to load a single page in the desktop browser or on a mobile device
● Scale - the ability of a system to serve pages under increasing load while maintaining page-level performance
● Throughput - number of requests per unit time (requests per minute or requests per second)
● Response Time - time it takes the system to respond. Depending on the context can also include the time to deliver the results over the users’ network connection, or render the page in the browser or mobile phone.
● SLA - service level agreement
  ○ Perf: “the page will load in under 3 seconds”
  ○ Scale: “we can handle 10,000 requests/minute”
● OOTB - Out of the box, what’s available without doing any custom coding
● Arrival Rate - aka Load. The RPM (requests per minute) that will be required of the system. This is typically defined on a page by page basis, e.g. arrival rate to the home page, to the search page, to the article page, to the cart page, checkout page, etc.
● Sustained Load - the average load during the heaviest parts of the day, in RPM.
● Peak Load - the highest expected daily, weekly, monthly, or seasonal load.
● Org Shape - number of objects in the org, e.g. number of customers, accounts, products, carts, etc.

A Little Background on Communities Perf and Scale

● We have tuned all of the OOTB Community Components so that they have good performance, and so that they scale.
  ○ There are about 40 components, from Feeds, to KB Articles, Top Users, etc.
● But, as soon as you start customizing your Community -- whether it’s using non-Communities objects, or using Apex, or building custom Lightning components -- the OOTB performance and scale numbers no longer apply.
  ○ Because the custom code may not be performant

Performance and Scale Best Practices

● Draw a full system diagram which includes all future features, systems (internal and external) and users (web, mobile, API, etc.) that will touch Salesforce

[Diagram: Salesforce (B2B & B2C, Service Console) surrounded by 1. eCommerce, 2. My Web, 3. My App, 4. Assist Channels, and 5. System to System Integration, with two connections labeled API]

Performance and Scale Best Practices

● Estimate load levels for each part of the system. Examples...
  ○ State assumptions: 8 hour day. Sustained = avg during middle 3 hours. Peak = monthly promos.

[Diagram: the same system diagram, annotated with load estimates for each part]

  ○ 1. eCommerce - Sustained: 10 orders/min. Peak: 100 orders/min. SLA: < 10 seconds/order
  ○ 2. My Web - Sustained: 100 pv/sec. Peak: 1000 pv/sec. SLA: < 3 seconds/page
  ○ 3. My App - Sustained: 200 API calls/sec. Peak: 400 API calls/sec. SLA: < 300ms/request
  ○ 4. Assist Channels - Sustained: 5,000 agents taking 4 calls/hour. Peak: 10,000 agents taking 6 calls/hour. SLA: < 5 seconds/page
  ○ 5. System to System Integration - Sustained: 2M API calls/day via Mulesoft (8 hour day). Peak: 10M API calls/day with 50% in 2 hours. SLA: < 500ms

Performance and Scale Best Practices
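
To compare the load estimates from the previous slide against each other, it can help to convert them all to a common per-second rate. The following is a minimal sketch of that arithmetic, using only the example figures from the slide. The helper name `per_second` is hypothetical, and the peak calculation assumes the "50% in 2 hours" traffic is spread evenly over those 2 hours.

```python
SECONDS_PER_HOUR = 3600


def per_second(count, hours):
    """Average rate per second for `count` events spread evenly over `hours`."""
    return count / (hours * SECONDS_PER_HOUR)


# 5. System to System Integration
integration_sustained = per_second(2_000_000, 8)    # 2M calls over an 8 hour day
integration_peak = per_second(10_000_000 * 0.5, 2)  # 50% of 10M calls within 2 hours

# 4. Assist Channels: number of agents x calls per agent per hour
assist_sustained = per_second(5_000 * 4, 1)
assist_peak = per_second(10_000 * 6, 1)

# 1. eCommerce: orders per minute converted to orders per second
ecommerce_sustained = 10 / 60
ecommerce_peak = 100 / 60

print(f"Integration: {integration_sustained:.1f}/sec sustained, {integration_peak:.1f}/sec peak")
print(f"Assist:      {assist_sustained:.2f} calls/sec sustained, {assist_peak:.2f} calls/sec peak")
print(f"eCommerce:   {ecommerce_sustained:.2f} orders/sec sustained, {ecommerce_peak:.2f} orders/sec peak")
```

Note how the integration peak comes out at roughly ten times its sustained rate, even though the daily total is only five times larger. Concentrating half of the calls into 2 hours is what drives the difference.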

● Break down usage to the next level. Examples...
  ○ eCommerce Usage: (features go here)
  ○ My Web Page Usage:
    ■ Home 40%
    ■ Search 8%
    ■ Product List 10%
    ■ Product Detail 20%
    ■ Create Case 2%
    ■ View Case 3%
    ■ Other 17%
  ○ My App Usage:
    ■ Read /account 25%
    ■ Read /assets 25%
    ■ Read /users 25%
    ■ Create any of above 10%
    ■ Update any of above 15%
  ○ Assist Channels Usage: (features go here)
  ○ API Usage: (/endpoints go here)

[Diagram: the same system diagram, annotated with the usage breakdown for each part]

Performance & Scale Considerations
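
The My Web page usage breakdown from the previous slide can be combined with an overall load estimate to get a per-page arrival rate. The sketch below assumes the example figures of 100 pv/sec sustained and 1000 pv/sec peak apply to My Web as a whole. The function name `per_page_rates` is hypothetical.

```python
# Share of My Web page views per page, in percent (from the usage breakdown).
MY_WEB_USAGE_PCT = {
    "Home": 40,
    "Search": 8,
    "Product List": 10,
    "Product Detail": 20,
    "Create Case": 2,
    "View Case": 3,
    "Other": 17,
}


def per_page_rates(total_pv_per_sec, usage_pct):
    """Split a total page-view rate across pages according to their usage share."""
    if sum(usage_pct.values()) != 100:
        raise ValueError("usage percentages must add up to 100")
    return {page: total_pv_per_sec * pct / 100 for page, pct in usage_pct.items()}


sustained = per_page_rates(100, MY_WEB_USAGE_PCT)   # assumed sustained load, pv/sec
peak = per_page_rates(1000, MY_WEB_USAGE_PCT)       # assumed peak load, pv/sec

for page in MY_WEB_USAGE_PCT:
    print(f"{page:15s} sustained {sustained[page]:6.1f} pv/sec   peak {peak[page]:6.1f} pv/sec")
```

These per-page rates are the arrival rates a load test would need to reproduce for each page.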

● Important Performance Factors
  ○ Size and shape of org (amount and type of data)
  ○ Number and types of users (CC vs CC+/PRM license type)
  ○ Data complexity (roles and sharing rules if any, custom objects, etc.)
  ○ Process complexity (e.g. complex Apex calls made by customer code)
  ○ User flows (number of pages viewed per second, and resulting Apex calls made by customer code)
  ○ Expected load levels (both sustained and peak)
● Important Performance Metrics
  ○ Transaction throughput (tps - transactions per second)
  ○ Response times (median, p95)
  ○ App and Db CPU % (median, p95)
● The two most important items are almost always Load and Db CPU

Performance Best Practices - Overview
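
The metrics on the previous slide are reported as median and p95 because an average can hide a slow tail. As a minimal sketch, the two values can be computed from a list of response times as follows. The sample data is made up for illustration, and the nearest-rank method used here is one of several common percentile definitions.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds from one test run.
response_times_ms = [120, 135, 140, 150, 155, 160, 180, 210, 250, 900]

median_ms = statistics.median(response_times_ms)
p95_ms = percentile(response_times_ms, 95)

print(f"median: {median_ms} ms, p95: {p95_ms} ms")
```

In this made-up sample, a single slow request barely moves the median but dominates the p95, which is why both are worth tracking.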

● Set up a performance and scale program at the start of the development cycle
  ○ Why? Because if you wait until you’re about to go live, it will probably be too late to fix.
  ○ Decide on performance, scale and SLA requirements
  ○ Create a test plan
  ○ Run tests hourly or daily, looking for changes in key metrics such as response times and throughput.
  ○ Understand and fix issues right away.
  ○ Do this from Day 1

Performance Best Practices - Page Level Testing

● We recommend you run page level tests regularly (preferably nightly)
  ○ Why: So you’re not surprised about your site’s front-end performance
  ○ How: Set up a test community and run performance tests nightly using a continuous deployment model
  ○ Customize: As you create custom components, add them to the page or pages where they will be used, and see how that impacts page load times
  ○ Watch: For changes in page load times as you add new features and components to your pages
  ○ CPO: Use Community Page Optimizer to look deeper into your component code

Performance Best Practices - Regular Scale Testing

● We recommend you run low load scale tests regularly (preferably nightly)
  ○ Why: So you’re not surprised about your site’s back-end performance
  ○ How: Set up a test community and run low load (e.g. 5 concurrent users for 10 minutes) tests nightly using a continuous deployment model
  ○ Watch: For changes in key metrics such as response times and throughput. If something gets slower, you can look for the cause immediately rather than trying to find the issue near the release
  ○ Event Logs: Use Event Logs to dig deeper into which calls are the slowest on the back-end.

Performance Best Practices - Scale Testing
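
Watching for changes between nightly low load runs can be automated by comparing each run against a stored baseline. The following is a minimal sketch of such a check. The metric names, the sample numbers, and the 10% tolerance are all assumptions for illustration, not values from any Salesforce tool.

```python
# Metrics where an increase is bad, and metrics where a decrease is bad.
HIGHER_IS_WORSE = {"median_ms", "p95_ms"}
LOWER_IS_WORSE = {"throughput_rpm"}


def find_regressions(baseline, current, tolerance=0.10):
    """Return {metric: relative change} for metrics that got worse by more than `tolerance`."""
    regressions = {}
    for name, base in baseline.items():
        change = (current[name] - base) / base
        if name in HIGHER_IS_WORSE and change > tolerance:
            regressions[name] = change
        elif name in LOWER_IS_WORSE and change < -tolerance:
            regressions[name] = change
    return regressions


# Hypothetical results from the baseline run and from tonight's run.
baseline = {"median_ms": 400, "p95_ms": 900, "throughput_rpm": 300}
tonight = {"median_ms": 410, "p95_ms": 1200, "throughput_rpm": 295}

for metric, change in find_regressions(baseline, tonight).items():
    print(f"{metric} changed by {change:+.0%} versus baseline")
```

A check like this could fail the nightly build when any metric regresses, so the cause is investigated while the responsible change is still fresh.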

● Estimate data size and shape
  ○ number of accounts, users, objects, feeds, groups, articles, topics, etc.
  ○ any complex relationships between objects
  ○ any complex role hierarchies and sharing rules
● Build a representative org on sandbox
● Estimate load
  ○ Use existing site data (if available) as a starting point
  ○ Build user personas (admin, browser, searcher, shopper, buyer, etc.)
  ○ Build a site map and fill out likely page flows for each persona
  ○ Estimate user arrival rates, login rates, which pages are being viewed, page views per session, etc.
● Write persona-based load generation scripts using a tool like LoadRunner or JMeter
● Run the test nightly at a low-ish level (e.g. 5 concurrent users for 10 minutes), and watch for changes in response times and throughput
● You will need to get permission (KA article) to do a full load scale test at your sustained or peak loads, but doing small load tests will likely never be noticed.

Performance Best Practices - Other Items

● Other best practices (also do these from Day 1):
  ○ Enable Event Monitoring (preferably realtime Event Monitoring) and set up dashboards to watch front-end and back-end performance
  ○ Single page testing - use a product like Webpagetest.org or Catchpoint.com to test page load performance
  ○ Page flow testing - use a product like Selenium (running in-house) or Catchpoint.com (running in-house and/or on their global nodes) to test page flow performance

What Happens If I’m Super Successful?

● We have seen customers who start at lower levels, say 1000 customers a day, and the site works well, and with good performance.
  ○ Then one day they become super successful, hitting 100,000 customers in a day. And the site falls over.
  ○ What happened? What most likely happened was that the code wasn’t scale tested, and bottlenecks appeared at scale that weren’t there at lower load levels.
  ○ Things like: unoptimized SOQL and SOSL queries, row locks interfering with simultaneous requests, external callouts that are slower at higher loads, admin features being run during peak times

But, Will My Site Scale? -- “It Depends”

● When people ask me if their site will scale I always qualify it by saying “it depends”
● You can find a list of Salesforce governors and limits in the help doc Execution Governors and Limits
● And there are also scale limits, because Salesforce is not infinitely horizontally scalable. Not at the app server level. And not at the database level.
● But what you really want to know is: Will Salesforce be able to handle my site today, tomorrow, and under peak load?

But Will My Site Scale? -- OOTB vs Customized Community

● OOTB (out of the box) communities, with OOTB community components have been scale tested to the following levels:
  ○ 10K RPM (requests per minute) for Guest pages < 5% db cpu
  ○ 1K - 5K RPM for Authenticated pages < 5% db cpu (depends on which components are on the page)
● Any customer can create an OOTB community and is welcome to run their site at these levels.
● But four things impact our ability to scale beyond those levels
  ○ Multi-tenant model. Salesforce uses a multi-tenant model, meaning other customers are on the same hardware. We can work with Capacity Planning to make sure you are on a pod which has enough capacity for the amount of headroom which you will require.
  ○ Guest Access vs Authenticated Access
  ○ License Type
  ○ Custom Code

But Will My Site Scale? -- Guest vs Authenticated Access

● Guest pages
  ○ If you have a primarily guest access community, using solely OOTB pages, it will be mostly limited by app server cpu (because we aggressively cache OOTB components). If you have permission to do so, we can support upwards of 50K RPM (800 RPS).
● Authenticated pages
  ○ Authenticated pages require sharing checks, and sharing checks use the database, which is why Authenticated pages do not scale like Guest pages. Authenticated pages scale pretty linearly with db cpu, so 10K RPM is possible, but that would be 50% (or more) db cpu, and you would need to get special permission to use that much db cpu.

But Will My Site Scale? -- License Type

● CC User License
  ○ User provisioning is faster
  ○ Limited sharing capabilities
● CC+/PRM User License
  ○ User provisioning is slower
  ○ Additional sharing hierarchies available
  ○ Sharing-related background admin operations can be expensive and lead to row locks, e.g. role reparenting, account owner changes, etc.

But Will My Site Scale? -- Custom Code

● Custom code is the wildcard
  ○ As soon as you add custom components to the mix, that's where we often see issues. If you have a custom component on a page that hits the db with an expensive SOQL query, it will be a problem. Can you fix that? Yes, if you can cache the output of that SOQL query in Platform Cache you can mitigate many db cpu issues.
  ○ Another example. Let’s say that you have a page with a cart. How many SOQL queries does it take to display the cart? How much code do you run at cart Submit/Save time? How many triggers? Do the triggers cause row locks? How many callouts are made? You can fix some of these issues by running the code asynchronously...as long as you’re ok with eventual consistency.
  ○ Bottom line, the only way to ensure that custom code will scale is to run a high scale load test.

But Will My Site Scale? -- Concurrency

● Lots of customers want to know how many concurrent users(*) Salesforce can handle
● Again “it depends”
● With an OOTB community, and no custom code, Salesforce can handle 1000s of concurrent users
● Where concurrency issues can occur is when:
  ○ a) Your code hits a governor limit such as the “max concurrent long running apex requests”. Hitting any limit can cause your site to be throttled.
  ○ b) Your custom code has slow or record-locking SOQL or SOSL queries.

* Concurrent users = many 100s or 1000s of users hitting your site within a short period of time, e.g. minutes or hours. It does not mean “1000s of users hitting a site at the exact same second”.

Performance Case Study - Multinational Clothing Manufacturer

● Vendor interested in using Salesforce to run their B2B transactions globally
  ○ 40K dealers, 20K account groups, 1M users
  ○ 2.4M articles of clothing in catalog, 3.2M price list rows
  ○ 30K price lists, 12M account/price groups
  ○ 19 languages, 12 currencies
  ○ 100-400 items per order, bulk orders of up to 750K items
● Customer generated a test org with the above data shape and size
● 5 Personas: anonymous user, small dealer, larger dealer, etc.
● Goal: 7,500 orders/day, with peak of up to 1,500 users/hour
● Load tests:
  ○ first test at 6,000 users/hour (4x expected load)
  ○ a second test at 22,000 users/hour (14x expected load)
● Results:
  ○ ~0.3% app server CPU, ~2% db server CPU for 1,500 concurrent users
  ○ This was approved conditionally: the condition being that they work with capacity planning to ensure enough ongoing capacity on their production pod

Performance Case Study - Multinational Sundries Vendor

● Vendor interested in using Salesforce to run their B2B transactions globally
  ○ Small convenience and grocery stores
  ○ 70 countries, 2M buyers
  ○ < 1000 SKUs
  ○ 20-40 line items per order, 1-5 orders per customer per week
  ○ 1M orders in year 1, rising to 50M orders in year 5
● Expected load in year 5
  ○ 330K sessions per day
  ○ 240K orders per day
  ○ sustained 10K orders/hour, peak of 20K orders/hour
● 3 personas: simple buyer, owner buyer, multinational staff
● Results:
  ○ Scale tests exceeded Db CPU guardrails
  ○ This was not approved, but customer was nominated for Frontier Scale program which allows very large customers to exceed normal guardrails

Setting Up A Performance Testing Program

● An effective performance testing practice has the following capabilities
  ○ Designing test plans.
  ○ Selecting and adopting tools to enable the test practice.
  ○ Writing the tests.
  ○ Having logs aggregated so detailed analysis can be done on how the system is performing.
  ○ Being able to write queries against the system event logs.
  ○ Establishing key metrics and baselines.
  ○ Scaling up the test infrastructure to generate the required load.
  ○ Incorporating the above into the existing development life cycle.

Establishing A Performance Testing Practice

● CSG structures a class with customers as follows:
  ○ Establishing a Performance Test Practice
    ■ With a pragmatic deep dive in the relevant Salesforce technologies, best practices, and patterns. The folks responsible for authoring the tests don't need to be Salesforce gurus, but they do need a certain level of understanding of Salesforce mechanics and ecosystem.
    ■ Review test results enrichment with Event Monitoring.
    ■ Talk through the desired testing cycle. Identify opportunities to align performance testing with the existing Hulu delivery process.
    ■ Identify the roles and responsibilities of who participates in the performance testing life cycle.
    ■ Talk through test scaling issues
  ○ Working through the Initial Challenges of Becoming Productive
    ■ Writing a test plan.
    ■ Identify the key metrics the test will measure (e.g. EPT, Database CPU %).
    ■ Identify the server side events available through Event Monitoring for our test plan.
    ■ Create the tests for the use case with Hulu's selected testing tools.
    ■ Write the required queries for the server side events.
    ■ Bundle up the client side results with the server side results to form a test results report.

Page Level Test - Example Output

● A Catchpoint test with color-coded page load times. You can drill into any dot to see a detailed waterfall chart for that test run.

Event Log - Example Output

● Logfiles imported into