<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:webfeeds="http://webfeeds.org/rss/1.0" version="2.0">
  <channel>
    <atom:link href="http://pubsubhubbub.appspot.com/" rel="hub"/>
    <atom:link href="https://f43.me/airbnb-engineering.xml" rel="self" type="application/rss+xml"/>
    <title>Airbnb Engineering</title>
    <description>Creative engineers and data scientists building a world where you can belong anywhere. http://airbnb.io - Medium</description>
    <link>http://medium.com</link>
    <webfeeds:icon>https://s2.googleusercontent.com/s2/favicons?alt=feed&amp;domain=medium.com</webfeeds:icon>
    <generator>f43.me</generator>
    <lastBuildDate>Fri, 13 Mar 2026 05:33:11 +0100</lastBuildDate>
    <item>
      <title><![CDATA[Recommending Travel Destinations to Help Users Explore]]></title>
      <description><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*GmYX3GemYdlNvs2tOLlFbQ.jpeg"></figure><p><em>How we built a destination recommendation model that helps users spark inspiration and narrow down choices to make journeys smoother.</em></p><p>By: <a href="https://www.linkedin.com/in/weiwei-guo/">Weiwei Guo</a>, <a href="https://www.linkedin.com/in/bin-xu-96253aa5/">Bin Xu</a>, <a href="https://www.linkedin.com/in/sundar-srini/">Sundara Rajan Srinivasavaradhan</a>, <a href="https://www.linkedin.com/in/tangjie81/">Jie Tang</a>, <a href="https://www.linkedin.com/in/xiaowei-liu-60415841/">Xiaowei Liu</a>, <a href="https://www.linkedin.com/in/bharathipriyaa/">Bharathi Thangamani</a>, <a href="https://www.linkedin.com/in/liweihe/">Liwei He</a>, <a href="https://www.linkedin.com/in/huiji-gao/">Huiji Gao</a>, <a href="https://www.linkedin.com/in/tracy-xiaoxi-yu/">Tracy Yu</a>, <a href="https://www.linkedin.com/in/hui-gao-275a924/">Hui Gao</a>, <a href="https://www.linkedin.com/in/stephanie-moyerman/">Stephanie Moyerman</a>, <a href="https://www.linkedin.com/in/sanjeevkatariya/">Sanjeev Katariya</a></p><p>Airbnb users in the trip planning stage may not have a clear idea of travel destinations, travel dates, or other preferences. They exhibit different behaviors compared to users who have a clear itinerary in mind. More exploratory users visit the Airbnb platform less often and are less likely to book listings in the near future; they’re more likely to search for a broad area such as “France” looking for inspiration. We believe that by helping users in the exploration stage, we can spark inspiration, reduce decision friction, and drive improvements in engagement and conversions.</p><p>In this blog post, we describe how we help users in the exploration stage by recommending travel destinations. 
Modeling destination intent poses several challenges: for example, how to effectively integrate diverse signals (users’ long-term interests vs. short-term interests), how to balance dormant user behavior vs. active user behavior, and how to encode rich geolocation knowledge.</p><p>To address these challenges, we developed a framework that predicts users’ destination intent based on their actions on the Airbnb platform. While the framework is inspired by language modeling, we introduce several key adaptations in training data creation, model architecture, and loss function to tailor it to the destination recommendation problem in the travel domain. Lastly, we present two applications, autosuggest and abandoned search email notifications, that help users explore destination possibilities and facilitate booking decisions.</p><h3>Model architecture</h3><p>Travel destination is one of the primary aspects users explore during trip planning, as it largely determines subsequent decisions such as travel timing, budget, and accommodation preferences. User travel destination preferences are driven by a combination of historical behavior, contextual signals, and temporal factors. For instance, users who previously booked listings in Hawaii may exhibit a preference for beach or tropical destinations, while seasonal context (e.g., summer) may shift their intent toward cooler locations.</p><p>In our model, we generalize the destination prediction based on historical user preference data. (Users are able to opt out of this personalization.) As shown in Figure 1, we treat each user action as a token, inspired by language modeling. We use transformers to model sequences of user actions as recorded in various sources: booking history, view history, and search history. Each action is represented by the sum of three embeddings: its city, its region, and the number of days from the action to today. We also use contextual information, such as the current time, to capture seasonality. 
This setup enables the model to summarize users’ short-term interests (views, searches) and long-term interests (bookings), and to make a holistic prediction of destination intent.</p><p>Figure 1: model architecture.</p><h3>Balancing active users and dormant users</h3><p>At Airbnb, we need to make predictions not only for “active users,” but also for “dormant users.” They exhibit different behaviors, for example:</p><ul><li><strong>Active users: </strong>User A searched for listings in the California Bay Area last week. She is currently looking for more affordable listings in the Bay Area.</li><li><strong>Dormant users:</strong> User B made several bookings in 2025 and hasn’t returned to Airbnb since then. He is currently exploring ideas for a summer vacation in 2026.</li></ul><p>Motivated by these two types of users, we design the training data shown in Figure 2. For each booking, we create 14 training examples in total, in two parts:</p><ol><li>Seven training examples for active users, one for each of the 1 to 7 days before the booking date. For these seven examples, we use the up-to-date booking/view/search data. This mimics the late booking stage, when users have a rough idea of where to go.</li><li>Seven training examples for dormant users, randomly sampled from 8 to 365 days before the booking. For these seven examples, we use only booking data, to mimic the early planning stage, when users don’t have a concrete idea and haven’t come to Airbnb.</li></ol><p>Figure 2: T is the date of the latest booking. The arrows at the bottom show the training examples used for the planning stage; the arrows in the upper-right corner illustrate the training examples used for the booking stage.</p><h3>Improving location understanding</h3><p>At Airbnb, we have rich geolocation information about cities and their relationships. 
For example, the California Bay Area contains many closely related cities; a user interested in staying in San Francisco may also consider nearby cities such as San Jose. For the purposes of destination recommendation, the Bay Area can be viewed as a broader “region” that encompasses multiple cities.</p><p>To incorporate this information into our framework, we use multi-task learning. Specifically, we add multiple prediction heads at the final layer of the model, each corresponding to a different prediction task. As shown in Figure 1, the model is trained to predict both the region-level and the city-level destination. By jointly learning these tasks and encouraging consistency between region and city predictions, the model learns richer geolocation representations of cities.</p><h3>Applications</h3><p>We deployed the resulting model in two features of the Airbnb platform. The first is autosuggest. When users click on the search bar, multiple city recommendations are presented. Online A/B testing shows significant booking gains in regions where English is not the primary language; further analysis indicates that these recommendations benefit not only users who have not yet decided on a destination, but also users who are open to booking more affordable listings in neighboring cities.</p><p>The second application is abandoned search email notifications. When a user abandons a search on Airbnb, we send follow-up emails featuring listings from areas predicted by the destination recommendation model. This helps drive bookings by encouraging users to explore alternative listings within the recommended destinations and re-engaging them to complete a booking on Airbnb.</p><h3>Conclusion</h3><p>In this post, we described a destination recommendation framework designed to support users in the exploration stage of trip planning, when intent is often ambiguous and preferences are still forming. 
Our framework includes several key innovations: modeling multiple sequences of user actions to balance short-term and long-term interests, designing training data to accommodate both active and dormant user behaviors, and using multi-task learning to incorporate rich geolocation information. Deployed in autosuggest and abandoned search email notifications, the model helps users discover relevant destination alternatives and drives measurable booking gains. Looking ahead, this framework provides a solid foundation for modeling other preferences, such as travel times and price preferences, enabling broader and deeper personalization across the travel planning journey.</p><p>If this type of work interests you, check out some of our <a href="https://careers.airbnb.com/">open roles</a>.</p><h3>Acknowledgments</h3><p>We would like to especially thank the following people for their great collaboration throughout this project: Kidai Kwon, Phanindra Ganti, Kedar Bellare, Malay Haldar, Soumyadip Banerjee, Michael Kinoti, Yi Li, Amisha Patel, Rachel Zhao, Zhentao Sun, Wei Jiang, Jackie Liu, Ying Xiao, Hongzhao Huang, Chen Qian, Haiyang Han, Pengyu Hou, Haichun Chen, Sherry Chen, Pavan Tapadia, Stephen Simburg, Clarence Quah, Chris Tarello, Eric Kostenbauder, Linda Yu, Gary Chang.</p><hr><p><a href="https://medium.com/airbnb-engineering/recommending-travel-destinations-to-help-users-explore-5fa7a81654fb">Recommending Travel Destinations to Help Users Explore</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/recommending-travel-destinations-to-help-users-explore-5fa7a81654fb</link>
      <guid>https://medium.com/airbnb-engineering/recommending-travel-destinations-to-help-users-explore-5fa7a81654fb</guid>
      <pubDate>Thu, 12 Mar 2026 19:55:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[It Wasn’t a Culture Problem: Upleveling Alert Development at Airbnb]]></title>
      <description><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Cy9Y4YIsQLRhjl15rMGqkQ.jpeg"></figure><h4>How we changed our Observability as Code alert review process and cut development cycles from weeks to minutes.</h4><p>Observability as Code (OaC) — defining alerts, dashboards, and SLOs via code rather than UI — is table stakes for large engineering organizations. With OaC, observability adopts software development’s version control, code review, and testing processes, achieving the same level of discipline as a result. At Airbnb’s scale (thousands of engineers and services), this is the foundation that lets teams ship confidently while maintaining the reliability our guests and hosts depend on.</p><p>Yet there’s a critical gap in most OaC workflows. While we bring rigor to alert definitions through code review and version control, the actual behavior of those alerts often can’t be validated until they’re live. Production becomes the proving ground. Problems surface either as noise that erodes trust or silence that hides real incidents.</p><p>This tolerance of high alert noise might appear to be a culture problem, but we realized it was actually a gap in the developer workflow. We solved it by building accessible, fast feedback loops to preview, validate, and surface actionable insights on alert behavior before PR submission. With these changes, development cycles collapsed from weeks to minutes, and we successfully migrated 300,000 alerts from a vendor to Prometheus, a feat that wouldn’t have been possible otherwise.</p><h3>Airbnb’s OaC North Star</h3><p>Our Observability as Code North Star is for product teams to receive out-of-the-box, best-practice monitoring from platform teams. When a product engineer adopts Kubernetes, a service framework, or a database, they should inherit battle-tested alerts, dashboards, and SLOs. 
The best monitoring gives product engineers the benefit of all of Airbnb’s infra and platform domain expertise immediately. We call this “zero touch.”</p><p>We began our OaC journey more than 10 years ago when we <a href="https://medium.com/airbnb-engineering/alerting-framework-at-airbnb-35ba48df894f">built Interferon</a>, starting with 1,000 alerts. Today, we manage 300,000. By any measure, Interferon was a success, scaling our monitoring practices towards our North Star.</p><p>However, this success introduced new operational challenges. With so many alerts in production, validating any OaC changes became costly, and it became harder to iterate. Engineers faced a difficult tradeoff: tuning an alert template might reduce noise, but it also risked losing an important signal. Without a way to preview alert behavior, the safer choice was often to leave things as-is.</p><h3>The problem: Traditional code review can’t validate alert behavior</h3><p>The problem wasn’t Interferon itself, but rather a development workflow gap. Our North Star requires platform teams to define and maintain monitoring patterns at scale, but we lacked the necessary tooling to effectively validate those patterns against reality.</p><p>Traditional code review can validate syntax and logic, and unit tests can verify outputs. But neither can answer the questions that matter most: “How will these alerts behave in production? What noise might they generate? Will they needlessly wake up on-call engineers at 3 AM?”</p><p>That’s why you have to validate alerts against real-world data. If your assumptions about the data are wrong, your alert is wrong.</p><p>However, off-the-shelf query visualization tools are insufficient for the job. They don’t account for alert-specific parameters, and most notably, they show downsampled data that masks actual alert behavior — step sizes don’t match evaluation intervals. Also, the further you move from static configs, the harder validation becomes. 
With templating, reviewers must manually copy-paste queries and fill in variables. For a single alert, this is tedious and error-prone; for changes affecting hundreds of services, it’s impossible.</p><p>So in practice, developers often had to resort to a weeks-long process of deploying a new alert side-by-side with the existing one, waiting for real-world data to come in, validating, and then iterating.</p><h3>The solution: Making alert behavior visible</h3><p>What if an infrastructure engineer could validate a Kubernetes alert template, one that fans out to thousands of services, in 30 seconds instead of 30 days?</p><p>That question among others prompted us to rethink and rebuild our OaC platform from the ground up. Building on top of Prometheus’ open source foundations, we could develop the exact UX our engineers needed, particularly local diffs and pre-deployment validation.</p><p>The core of this workflow is local-first development: The same code and inputs run in production must run identically on a developer’s laptop and in CI. In addition, we built <strong>Change Reports</strong> that show how alerts will be modified and <strong>bulk backtesting</strong> that simulates alerts against historical data.</p><p>We rolled out this platform incrementally, with each milestone providing compounding value.</p><h4>Phase 1: Text-based diffs everywhere</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NEVDYuTGZJ6XwAsCUqetHA.png"><figcaption>An example alert diff in markdown format</figcaption></figure><p>We first generated markdown alert diffs with field-level granularity and query links — the “terraform plan” of OaC. We met developers where they work, in terminal via CLI and in PRs via CI. 
This solved the basic visibility problem: engineers could finally review the OaC generated alerts without error-prone copy-pasting.</p><h4>Phase 2: Dedicated Change Report UI</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Z2FoWr95uknzOPb3tYJwwg.png"></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*AQerk_Q3E1h3zOveLijmRg.png"><figcaption>The left image lists all alert modifications resulting from a code change. The right image dives into a specific alert.</figcaption></figure><p>We then built a Change Report UI showing side-by-side alert diffs exactly as they will appear in production, removing the guesswork and mental mapping between config and UI. However, the user was still responsible for mentally simulating alert behavior, which is challenging even for Prometheus experts to get right.</p><h4>Phase 3: Historical simulation via bulk backtesting</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*R4Gk9LyATkdH5hHHVros9g.png"></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*BA_Fv1oVTPymUQUwene7HQ.png"><figcaption>The backtesting integration into the change list (left) and individual alert (right) views</figcaption></figure><p>Finally, we built a backtesting system that runs proposed alerts against historical data, hooking directly into <a href="https://github.com/prometheus/prometheus/blob/c7bc56cf6c8f9c92e98beddca26ed9b47f8a5ac9/rules/manager.go#L97">Prometheus’s rule manager</a>. Backtesting allows users to understand which alerts would have fired, when, and why, as if they had existed the entire time. Displaying this simulated state inline in the Change Report UI answered the question that matters most: “How would this alert behave in production?”</p><p>We backtest in bulk for the entire diff — hundreds or thousands of related alerts — and surface quality signals to help reviewers focus their attention. 
We compute a “noisiness” metric and show alert firing timelines in the table view, letting users sort by potential problems and focus their effort.</p><h4>Putting it all together</h4><p>On the new platform, when a user makes an OaC change, they generate a Change Report via CLI or CI. We post a Change Report on all PRs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ibWolnBsOQD4yJCDk9zdag.png"></figure><p>The user reviews their changes via the UI. In this example, a one-week backtest was conducted, and the changes are sorted by noisiness, seen in the “Tuning” column, to help direct users’ attention.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*U6n6RvF4-fTi3MBVeQsjtA.png"></figure><p>The user can dive into individual alerts to learn more. This one looks to be problematic, firing once per day.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OU3knDx3DNoQGhVQHZPILA.png"></figure><p>The user can set overrides. Given the graph, 1.14 looks like a better trigger threshold.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hSNS3BDZ2sdJdHakuEb-1A.png"></figure><p>They can then see the impact of their changes. No more alert firings. This looks good and is ready to ship.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Vw2paibsDpqT2Cqe4MSK6A.png"></figure><h3>Our learnings</h3><h4>Compatibility over novelty</h4><p>A key architectural decision was compatibility over novelty. Rule groups are Prometheus’s standardized format. By taking that as input, we hooked directly into Prometheus’s rule evaluation engine rather than reimplementing it. We wrote results as Prometheus time series blocks, exposing the data via <a href="https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries">the standard query API</a>. This meant building analysis tools once. 
This standardization made our system portable, allowing us to reach all developers in their existing workflows.</p><h4>Guardrails aren’t optional</h4><p>Simulating thousands of alerts over 30 days quickly and without service degradations required careful design. Each backtest runs in its own Kubernetes pod with autoscaling to prevent resource contention. Concurrency limits, error thresholds, and multiple circuit breakers prevent cascading failures. A backtesting system that can destabilize production is worse than no system at all.</p><h4>Perfect is the enemy of shipped</h4><p>Our simulator doesn’t account for recording rule dependencies. We could have built a more sophisticated dependency resolver, but users can separate this into two distinct tasks — modify the recording rule first, then backtest alerts that depend on it — assuming they know they should. The Change Report UI helps, because when modified dependencies are detected, it highlights them and prompts resolution. This turns a technical limitation into a guided workflow. We shipped the 80% solution that immediately delivers value, leveraging the UI to close gaps.</p><h4>Own the full surface area</h4><p>Monitoring is often an afterthought — engineers have limited time under tight deadlines. Our job is to make that time as effective as possible. Prometheus is powerful but exposes a low-level API that requires expertise most engineers don’t have. To achieve our North Star, we introduced abstractions like anomaly detection, burn rate alerts, and change detection. But abstractions only simplify things when you own all the touchpoints: the input language engineers write, the generation process, the UI that displays results, and the validation tools that provide feedback. Partial ownership creates leaky abstractions. 
Full ownership lets us ruthlessly optimize for developer experience.</p><h3>The impact</h3><h4>Successful migration from a vendor to Prometheus</h4><p>We migrated 300,000 alerts from a vendor to Prometheus. Rewriting every alert would have been impossible with our old workflow, but we achieved it thanks to our Change Report UI, bulk backtesting, and an additional vendor-specific integration. By codifying our domain knowledge in the UX, what originally promised to be a multi-year slog of manual effort became a structured, confident migration.</p><h4>Collapsed development cycles</h4><p>The typical developer workaround for making alert changes — deploy side-by-side, wait, then iterate — became obsolete. Engineers now make and validate alert changes all within a single PR. Platform teams confidently deploy template changes affecting thousands of services. What once took a month of iteration now takes an afternoon.</p><h4>Culture transformation</h4><p>Even though we realized we had a workflow problem, not a culture problem, solving this problem still ended up transforming our culture. We reduced companywide alert noise by 90%, and engineers stopped tolerating noisy alerts and started competing to improve them. Platform teams resumed iterating on shared patterns. Alert hygiene became a point of pride, not a chore to avoid.</p><blockquote>Turnkey alert testing is the biggest positive improvement to alerts management in Airbnb’s history.</blockquote><blockquote>I’ve seen people spend hours debating the merits of changing an alert only to do nothing because of fear, uncertainty, and doubt. This new alert testing capability completely evaporates the stop energy and allows us to monitor with confidence.</blockquote><blockquote>- Gregory Szorc, Senior Staff Software Engineer</blockquote><h3>Conclusion</h3><p>Local-first development, Change Reports, and bulk backtesting give us the necessary tools to incrementally reach our North Star. 
Platform teams can now confidently iterate on monitoring for their domains. Zero touch is becoming how we operate, one cycle at a time.</p><p>Now that we’ve introduced pre-deployment visibility and validation to the alert lifecycle, our next step is to introduce that same rigor to on-call analysis.</p><p>If this type of work interests you, check out some of our <a href="https://careers.airbnb.com/">open roles</a>.</p><h4>Acknowledgements</h4><p>Thank you to the Reliability Experience team (Kevin Goodier, Harry Shoff, Rich Unger, and Vlad Vassiliouk) and our partners across the company who helped make this a reality.</p><hr><p><a href="https://medium.com/airbnb-engineering/it-wasnt-a-culture-problem-upleveling-alert-development-at-airbnb-01e2290eb0f5">It Wasn’t a Culture Problem: Upleveling Alert Development at Airbnb</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/it-wasnt-a-culture-problem-upleveling-alert-development-at-airbnb-01e2290eb0f5</link>
      <guid>https://medium.com/airbnb-engineering/it-wasnt-a-culture-problem-upleveling-alert-development-at-airbnb-01e2290eb0f5</guid>
      <pubDate>Wed, 04 Mar 2026 19:01:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Academic Publications & Airbnb Tech: 2025 Year in Review]]></title>
      <description><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FfdFpfjGZODh7ltZ2W2Mrg.jpeg"></figure><p>2025 was a big year for research at Airbnb, as we made significant progress toward our mission to use AI, data science, and machine learning to become the best travel and living platform.</p><p>Specifically, we doubled down on our presence at long-standing venues like KDD and CIKM, two of the most selective conferences in machine learning. At the same time, we expanded our research footprint by sharing our work in NLP, optimization, and measurement science at conferences such as COLING, LION, and VLDB.</p><p>Across these conferences, Airbnb researchers engaged directly with academic and industry peers by publishing and presenting papers, learning about the latest innovations, launching new collaborations, and mentoring emerging researchers. In this blog post, we’ll recap the conferences and key papers we presented in 2025, organized by research theme.</p><h3>Applied machine learning for search, ranking, and personalization</h3><h3>KDD (Knowledge Discovery and Data Mining)</h3><p><a href="https://kdd2025.kdd.org/"><em>KDD</em></a><em> is a flagship conference in data science research. Hosted annually by a special interest group of the Association for Computing Machinery (ACM), it’s where researchers learn about some of the most groundbreaking developments in data mining, knowledge discovery, and large-scale data analytics, which are critical to Airbnb’s efforts to improve core products like search and recommendations.</em></p><p><strong>Our participation</strong></p><p>We’ve been presenting at KDD since 2018, and 2025 was another strong year for us. We had multiple contributions across the applied data science track and workshops, which were well received by the broader community and even inspired us to consider open-sourcing some of our technology. 
We were also inspired by the related research in this area and are eager to explore these methods through new collaborations.</p><p><strong>Research highlights</strong></p><ol><li><a href="https://arxiv.org/abs/2508.00751">Harnessing the Power of Interleaving and Counterfactual Evaluation for Airbnb Search Ranking</a>: While A/B tests are crucial for developing ranking algorithms and recommender systems, they’re difficult to set up and can take extensive time to reach statistical significance (especially for products with long conversion cycles, like accommodation booking). In this paper, we shared techniques for rapid pre-A/B online assessments that help teams identify the most promising experiments, streamlining the overall process without sacrificing accuracy.</li><li><a href="https://drive.google.com/file/d/1Zeqk6aKXsCEFevB_jAbw4AXzxJ30uKwR/view">High Precision Audience Expansion via Extreme Classification in a Two-Sided Marketplace</a>: Airbnb search balances diverse global inventory with varied guest preferences for location, amenities, style, and price. This process requires efficient location retrieval to find the listings guests might realistically book by determining which geographic areas to query. 
We introduce a new approach to location retrieval by using a set of relevant, high-precision categorical location cells.</li></ol><p><strong>Links to all papers</strong></p><ul><li><a href="https://dl.acm.org/doi/10.1145/3711896.3737232">Harnessing the Power of Interleaving and Counterfactual Evaluation for Airbnb Search Ranking</a> (Qing Zhang, Alex Deng, Michelle Du, Huiji Gao, Liwei He, Sanjeev Katariya)</li><li><a href="https://drive.google.com/file/d/1Zeqk6aKXsCEFevB_jAbw4AXzxJ30uKwR/view">High Precision Audience Expansion via Extreme Classification in a Two-Sided Marketplace</a> (Dillon Davis, Huiji Gao, Thomas Legrand, Juan Manuel Caicedo Carvajal, Malay Haldar, Kedar Bellare, Moutupsi Paul, Soumyadip Banerjee, Liwei He, Stephanie Moyerman, Sanjeev Katariya)</li><li><a href="https://sites.google.com/view/tsmo2025/home">TSMO: Two-sided Marketplace Optimization</a></li></ul><h3>CIKM (Conference on Information and Knowledge Management)</h3><p><a href="https://cikm2025.org/"><em>CIKM</em></a><em> is a premier forum for discussing and presenting research at the intersection of information and knowledge management, including topics like AI, data mining, database systems, and information retrieval. Many of these topics directly intersect with our core product challenges, such as search, ranking, and recommendations.</em></p><p><strong>Our participation</strong></p><p>At CIKM 2025, Airbnb’s Relevance and Personalization team had <a href="https://sites.google.com/view/airbnb-relevance-publications/home?authuser=0">five peer-reviewed papers accepted for publication</a>, building on our participation in 2023 and 2024. These papers focused on advanced AI/ML techniques for search and recommendations and shared real-world insights from using these technologies at Airbnb’s scale. 
Industry and academic researchers, especially those working on two-sided marketplaces, engaged with our work and provided valuable feedback.</p><p><strong>Research highlights</strong></p><ol><li><a href="https://dl.acm.org/doi/10.1145/3746252.3761526">Augmenting Guest Search Results with Recommendations at Airbnb</a>: When guests use overly narrow criteria to search for accommodations, they often receive too few results, leading to a frustrating experience. This paper introduces a recommendation system that dynamically suggests alternatives (different dates, relaxed amenities, or adjusted price ranges) to help guests find suitable accommodations and improve the platform’s booking rate. <em>Authors: Haowei Zhang, Philbert Lin, Dishant Ailawadi, Soumyadip Banerjee, Shashank Dabriwal, Hao Li, Kedar Bellare, Liwei He, Sanjeev Katariya</em></li><li><a href="https://dl.acm.org/doi/10.1145/3746252.3761563">Maps Ranking Optimization in Airbnb</a>: Maps play a crucial role in Airbnb search and bookings, accounting for roughly 80% of search interactions. Yet map ranking has traditionally reused feed-ranking assumptions, which break down when we examine the NDCG (Normalized Discounted Cumulative Gain) metric. This paper explains why list-based NDCG fails to model user attention on maps, introduces a map-specific NDCG, and reports experiments showing that optimizing it yields booking gains. <em>Authors: Hongwei Zhang, Malay Haldar, Kedar Bellare, Sherry Chen, Soumyadip Banerjee, Xiaotang Wang, Mustafa Abdool, Huiji Gao, Pavan Tapadia, Liwei He, Sanjeev Katariya, Stephanie Moyerman</em></li><li><a href="https://dl.acm.org/doi/10.1145/3746252.3761577">BiListing: Modality Alignment for Listings</a>: To improve search ranking, we introduce BiListing (Bimodal Listing) embeddings, which use unstructured text and photo listing data as ranking signals. 
BiListing leverages large language models and pretrained language-image models to condense diverse unstructured data into a single embedding vector per listing and modality. In our experiments, BiListing delivered a 0.425% gain in NDCB (Normalized Discounted Cumulative Booking) and drove tens of millions in incremental revenue. <em>Authors: Guillaume Guy, Mihajlo Grbovic, Chun How Tan, Han Zhao</em></li><li><a href="https://dl.acm.org/doi/10.1145/3746252.3761521">Beyond Pairwise Learning-To-Rank At Airbnb</a>: In this paper, we introduce a method to improve the accuracy of pairwise learning-to-rank algorithms, the bedrock of modern search stacks. This approach captures interactions between items during pairwise comparisons, thereby giving us a better sense of what searchers truly want. We also share ways to implement this algorithm efficiently, along with results from online and offline experiments. <em>Authors: Malay Haldar, Daochen Zha, Huiji Gao, Liwei He, Sanjeev Katariya</em></li><li><a href="https://dl.acm.org/doi/10.1145/3746252.3761567">Learning to Comparison-Shop</a>: Traditional ranking models often evaluate items in isolation, disregarding the context in which users compare multiple items on a search results page. In this paper, we propose a novel ranking architecture, the Learning-to-Comparison-Shop (LTCS) System, that explicitly models and learns users’ comparison-shopping behaviors. Our experiments show statistically significant improvements of 1.7% in Normalized Discounted Cumulative Gain (NDCG) and 0.6% in booking conversion rate. 
<em>Authors: Jie Tang, Daochen Zha, Xin Liu, Huiji Gao, Liwei He, Stephanie Moyerman, Sanjeev Katariya</em></li></ol><h3>NLP &amp; building LLM systems in production</h3><h3>EMNLP (Empirical Methods in Natural Language Processing)</h3><p><a href="https://2025.emnlp.org/"><em>EMNLP</em></a><em> is a top-tier NLP conference that brings together practitioners and researchers to discuss new architectures and training strategies for language models, safety and evaluation strategies for LLMs, and real-world NLP applications. These research areas directly intersect with many of Airbnb’s product surfaces, such as customer support, search &amp; discovery, and trust &amp; safety. Additionally, each EMNLP cycle includes the release of new datasets, evaluation suites, and open-source libraries to help teams benchmark their progress against community standards.</em></p><p><strong>Our participation</strong></p><p>In 2025, we sponsored EMNLP and presented two papers on humans-in-the-loop in AI systems and advanced summarization techniques. We also used EMNLP’s community datasets to benchmark our system, which showcased where we excel and where we can build upon our success with additional best practices. The conference deepened academic collaborations through discussions on LLM evaluation, safety, and agentic AI design, including mentoring students and early-career researchers.</p><p><strong>Research highlights</strong></p><ol><li><a href="https://arxiv.org/abs/2510.06674">Agent-in-the-Loop, A Data Flywheel for Continuous Improvement in LLM-based Customer Support</a>: To improve our LLM-based customer support system, this paper introduces an Agent-in-the-Loop (AITL) framework that leverages new interaction data to continuously enhance model performance. This flywheel can help the system stay up to date with new product features, shifting user preferences, and updated support policies and procedures. 
We launched a pilot in the US, and the results demonstrate significant improvement in accuracy and helpfulness. <em>Authors: Cen Mia Zhao, Tiantian Zhang, Hanchen Su, Yufeng Wayne Zhang, Shaowei Su, Mingzhi Xu, Yu Elaine Liu, Wei Han, Jeremy Werner, Claire Na Cheng, Yashar Mehdad</em></li><li><a href="https://arxiv.org/abs/2510.06677">Incremental Summarization for Customer Support via Progressive Note-Taking and Agent Feedback</a>: Customer service agents multitask during support interactions, identifying core issues, tracking prior actions, and producing accurate notes. To streamline this workflow, we introduced an incremental summarization system that intelligently determines when to generate concise bullet notes during conversations, reducing agents’ context-switching effort without sacrificing quality. To improve the system over time, we also introduced a learning framework that enables agents to make real-time edits, immediately refining online note generation. <em>Authors: Yisha Wu, Cen Mia Zhao, Yuanpei Cao, Xiaoqing Su, Yashar Mehdad, Mindy Ji, Claire Na Cheng</em></li></ol><h3>COLING (International Conference on Computational Linguistics)</h3><p><a href="https://coling2025.org/"><em>COLING</em></a><em> is a top-tier NLP conference that covers both foundational research and industry applications of language models, including reasoning, evaluation, multilingual NLP, and real-world LLM systems. The work presented at this conference helps validate Airbnb’s technical direction and directly informs future investments.</em></p><p><strong>Our participation</strong></p><p>In 2025, Airbnb presented at COLING for the first time, sharing a paper titled “<a href="https://arxiv.org/abs/2510.10331">LLM-Friendly Knowledge Representation for Customer Support</a>” by Hanchen Su, Wei Luo, Wei Han, Yu Elaine Liu, Yufeng Wayne Zhang, Cen Mia Zhao, Ying Joy Zhang, and Yashar Mehdad. 
The paper presents a new format, Intent, Context, and Action (ICA), for structuring business knowledge in LLM-based QA and customer support workflows. Initial experiments in production show promising results. We also discovered relevant research in knowledge retrieval, LLM evaluation, and hallucination detection that will inspire future projects.</p><h3>Optimization, causal inference, and measurement science</h3><h3>MIT CODE (Conference on Digital Experimentation)</h3><p><a href="https://ide.mit.edu/events/2025-conference-on-digital-experimentation-mit-codemit/"><em>MIT CODE</em></a><em> is one of the premier venues for researchers and practitioners to discuss topics in online digital experimentation, causal inference, and data-driven product innovation. The conference supports our commitment to data-driven decision-making and using experimentation to understand the long-term impacts on guests, hosts, and marketplace health.</em></p><p><strong>Our participation</strong></p><p>In 2025, we had another strong showing at CODE, with a cohort of 6 data scientists and 3 academic collaborators. We gave talks in two sessions and presented a poster, which led to meaningful discussions with peer companies and interest in collaborating with academic research groups.</p><p><strong>Research highlights</strong></p><ol><li><a href="https://airbnb.tech/wp-content/uploads/sites/19/2026/01/BSTAR_abstract_5pages.pdf">Beyond the Experiment Window: Prospective Impacts Under Long-Term Ranking Dynamics</a>: Product teams frequently leverage A/B tests to assess different rankers. While these experiments are typically conducted over shorter periods, we also recognize the value of understanding longer-term dynamics (such as seasonality and user evolution) to further support sustained business objectives, like marketplace health. 
To solve this problem, we developed a causal framework that allows us to estimate the long-term impacts of ranking changes with strategic goals (like marketplace health) using A/B testing data.</li><li><a href="https://drive.google.com/file/d/1y_3sYhpydOKd8CmOGNT-2hPyJR2mb-QW/view?usp=sharing">Trustworthy Bayesian Inference in Batch-Adaptive Experimentation</a>: Adaptive experimentation, like multi-arm bandit methods, can improve experiment efficiency by reallocating traffic toward promising treatments. Continued advancements in these approaches are expanding our ability to maintain high standards of statistical validity. This paper introduces a practical Bayesian framework for inference in batch-adaptive experiments, specifically tailored to the operational realities of online platforms.</li></ol><p><strong>Link to all papers</strong></p><ul><li><a href="https://drive.google.com/file/d/13W6ihExC-qyFZmlciO2d7kM88LnNIoUp/view">Beyond the Experiment Window: Prospective Impacts Under Long-Term Ranking Dynamics</a> (Lo-Hua Yuan)</li><li><a href="https://drive.google.com/file/d/1y_3sYhpydOKd8CmOGNT-2hPyJR2mb-QW/view?usp=sharing">Trustworthy Bayesian Inference in Batch-Adaptive Experimentation</a> (Yicheng Li)</li><li><a href="https://drive.google.com/file/d/1TixwWkoHn_H4Lyrvml5h3XrdWxlGVC29/view?usp=sharing">Experimental Design for Product Launches with Collaborative User Networks</a> (Monu Kala)</li></ul><h3>INFORMS (Institute for Operations Research and the Management Sciences)</h3><p><a href="https://meetings.informs.org/wordpress/annual/"><em>INFORMS</em></a><em> brings together academics and industry professionals to discuss and share research across data science, machine learning, economics, behavioral science, and analytics.</em></p><p><strong>Our participation</strong></p><p>In 2025, our data science team was invited to INFORMS to present two talks in a session about bridging the gap between statistical methods and industry applications.</p><p><strong>Research 
highlights</strong></p><ol><li><a href="https://docs.google.com/presentation/d/1RkcTBkhqg1xNeQ-boeXVvgZK9JIUg37Zo590Eo67zgc/edit?slide=id.gc91b80c485_0_550#slide=id.gc91b80c485_0_550">Beyond Multi-Arm Bandits: Tackling Challenges in Adaptive Experiments at Airbnb</a>. In this talk, we walked through the metrics and infrastructure challenges when using classic bandit algorithms, which make it difficult to operationalize adaptive experiments. We propose a hybrid approach that incorporates bandit algorithms into A/B experiments to enable adaptive testing. We also discussed how we onboard and validate adaptive experiments across individual product domains at Airbnb.</li></ol><p><strong>Link to all papers</strong></p><ul><li><a href="https://drive.google.com/file/d/13W6ihExC-qyFZmlciO2d7kM88LnNIoUp/view">Beyond the Experiment Window: Prospective Impacts Under Long-Term Ranking Dynamics</a> (Lo-Hua Yuan)</li><li><a href="https://docs.google.com/presentation/d/1RkcTBkhqg1xNeQ-boeXVvgZK9JIUg37Zo590Eo67zgc/edit?slide=id.gc91b80c485_0_550#slide=id.gc91b80c485_0_550">Beyond Multi-Arm Bandits: Tackling Challenges in Adaptive Experiments at Airbnb</a> (Yicheng Li)</li></ul><h3>LION (Learning and Intelligent Optimization)</h3><p><em>The </em><a href="https://lion19.org/"><em>LION</em></a><em> conference is a premier gathering of researchers exploring the intersection of machine learning, artificial intelligence, and mathematical optimization.</em></p><p><strong>Our participation</strong></p><p>While Airbnb has attended LION in the past, 2025 was the first time we presented at the conference. 
Nathan Brixius presented “<a href="https://airbnb.tech/wp-content/uploads/sites/19/2025/10/brixius_block.pdf">Optimal Matched Block Design For Multi-Arm Experiments</a>,” which introduces a new optimization formulation using mixed-integer programming (MIP) to group subjects in multi-armed experiments, leading to more balanced groups and, in turn, more accurate experimental results. We also connected with leading experts in metaheuristics and AI fairness to help shape our future roadmap and sponsored the awards for the best papers presented at the conference.</p><h3>Data systems</h3><h3>VLDB (Very Large Data Bases)</h3><p><em>The </em><a href="https://vldb.org/2025/"><em>VLDB</em></a><em> Conference is one of the top two flagship conferences in data management and large-scale data systems, with over 1,500 researchers and practitioners attending.</em></p><p><strong>Our participation</strong></p><p>In 2025, we published our first paper at VLDB: “<a href="https://www.vldb.org/pvldb/vol18/p5210-lightstone.pdf">SQL:Trek Automated Index Design at Airbnb</a>” by Sam Lightstone and Ping Wang. The paper presents a novel approach for automated index design (code-named SQL:Trek). It uses query compiler cost models to identify effective indexes across many relational databases, including most MySQL and PostgreSQL derivatives. Additionally, the Airbnb team attended sessions on system efficiency, graph computing, and AI databases, and had the opportunity to meet other researchers.</p><h3>Conclusion</h3><p>Conferences remain a big part of our research program at Airbnb, helping us validate and refine our ideas through community feedback and providing a forum to share real-world insights that advance the field. 
In 2025, we doubled down on this vision by publishing papers for the first time at conferences in domains such as NLP, optimization, causal inference, and data systems, reflecting our ongoing commitment to using these technologies to create the best possible travel experiences.</p><p>As we look to 2026, we’re eager to expand our presence at these conferences and discover new ways to use AI, machine learning, and data science to build a best-in-class travel and living platform. If you’re interested in doing this type of work with us, consider joining us. <a href="https://careers.airbnb.com/">Apply for one of our open positions</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&amp;referrerSource=full_rss&amp;postId=7d79f57d3b52" width="1" height="1" alt=""><hr><p><a href="https://medium.com/airbnb-engineering/academic-publications-airbnb-tech-2025-year-in-review-7d79f57d3b52">Academic Publications &amp; Airbnb Tech: 2025 Year in Review</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/academic-publications-airbnb-tech-2025-year-in-review-7d79f57d3b52</link>
      <guid>https://medium.com/airbnb-engineering/academic-publications-airbnb-tech-2025-year-in-review-7d79f57d3b52</guid>
      <pubDate>Tue, 24 Feb 2026 19:36:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Safeguarding Dynamic Configuration Changes at Scale]]></title>
      <description><![CDATA[<h4><strong>How Airbnb ships dynamic config changes safely and reliably</strong></h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*v2VDklQn1NHAx-MiR8DYMQ.png"></figure><p>By <a href="https://www.linkedin.com/in/cosmo-qiu/">Cosmo Qiu</a>, <a href="https://www.linkedin.com/in/bo-t-b04912238/">Bo Teng</a>, <a href="https://www.linkedin.com/in/siyuan-zhou-85ba8057/">Siyuan Zhou</a>, <a href="https://www.linkedin.com/in/ankursoni/">Ankur Soni</a>, <a href="https://www.linkedin.com/in/willish/">Willis Harvey</a></p><p>Dynamic configuration is a core infrastructure capability in modern systems. It allows developers to change runtime behavior without restarting or redeploying services, even as the number of services and requests grows. In practice, that might mean rolling out a new address form for a region launch, tightening an authorization rule, or adjusting timeouts when a dependency is slow.</p><p>Like any powerful tool, dynamic configuration is a double-edged sword. While it enables fast iteration and rapid incident response, a bad change can cause regressions or even outages. This is a common challenge across the industry: balancing developer flexibility with system reliability.</p><p>In this post, we will outline the expectations of a modern dynamic configuration platform, then walk through the high-level architecture of Airbnb’s dynamic config platform and how its core components work together to enable safe, flexible config changes.</p><h3>Modern config platform essentials</h3><p>As Airbnb’s business grows, our expectations for the dynamic config platform have evolved over time through our own learnings as well as industry best practices. 
These shape our view of what a good dynamic config platform should provide, including:</p><ul><li><strong>A coherent experience for config management</strong>: The platform provides a streamlined, end-to-end experience for defining, reviewing, testing, and rolling out config changes. It covers the most common needs out of the box with rich built-in features, while still offering escape hatches for edge cases.</li><li><strong>Strong reliability, availability and safety guarantees</strong>: All config changes are validated, reviewed, and rolled out progressively, with clear ownership and well-defined access control. Treating config as code is a key focus: config changes are versioned, reviewed, and auditable like service code, but remain dynamic at runtime. The platform itself must be highly available so that services can reliably fetch and apply configs. Changes should be observable, with support for fast rollbacks when needed.</li><li><strong>Safe testing in isolated environments</strong>: Developers can validate config changes in isolated local or canary environments before they reach production.</li><li><strong>Flexible multi-tenant support</strong>: In a multi-tenant platform, different tenants have different risk profiles. The platform should allow config owners to customize how their configs behave per tenant, including deployment triggers, guardrails, and rollout strategy (for example, AWS zone or Kubernetes pod percentage-based rollouts).</li><li><strong>Fast and controlled incident response</strong>: During an incident, responders can ship emergency configs as needed with clear auditability. The platform also provides observability for config changes, so incident responders can tell what changed, who was affected, when the change was made, and who made the change. This enables them to effectively identify the culprit and take action.</li></ul><h3>High-level architecture</h3><p>At Airbnb, Sitar is the internal name for our dynamic config platform. 
It provides a common way for teams to manage runtime behavior safely. At a high level, Sitar has four main parts: a developer-facing layer, a control plane, a data plane, and the clients and agents that run alongside application code.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Bv62zjTLRJRCKNdcTgiazA.jpeg"></figure><p>The developer-facing layer is where config changes are created and reviewed. By default, configs are managed through a Git-based workflow, while a few exceptions are managed in the web interface (sitar-portal), which is also used for admin operations such as emergency deployments.</p><p>The control plane is responsible for orchestrating config changes. It enforces schema validation, ownership, and access control, and decides how each change should be rolled out: for example, which environments or AWS zones to target, what percentage of Kubernetes pods to start with, and how to progress the rollout over time. The control plane also specifies how to roll back the changes when needed, and supports routing in-flight configs to specific environments or slices of subscribers for fast testing.</p><p>The data plane provides scalable storage and efficient distribution of configs. It acts as the source of truth for config values and versions, and propagates updates to services reliably, consistently, and quickly.</p><p>On the product services side, an agent sidecar running alongside each service fetches the subscribed configs from the data plane and maintains a local cache. 
Client libraries inside the service then read from this cache and expose configs to application logic with fast, in-process access and optional fallbacks.</p><p>Putting these together, a typical change starts from a Git flow, proceeds through control-plane validation and rollout decisions, into the data plane for distribution, and finally to agents and client libraries that apply the config updates to application logic.</p><h3>Key design choices</h3><p>In this section, we highlight a few key design choices that shape how the platform looks and operates.</p><h4><strong>Configs as code with a Git-based workflow</strong></h4><p>By default, config changes are managed through a Git-centric workflow. We use GitHub as the primary interface for managing configs, because we have an established and responsive internal team to manage GitHub Enterprise. GitHub integrates naturally with our existing CI/CD tooling, so we can reuse rich validation and deployment pipelines without re-inventing the wheel. This approach gives developers the same experience for config changes as for code changes: open a pull request, get reviews, merge, and deploy. GitHub also brings additional benefits such as mandatory reviewers, review and approval flows, and a change history. Configs under the same theme are grouped into tenants, with clear owners, customizable tests, and a dedicated CD pipeline.</p><p>While the Git-based flow is the default, we keep a UI portal for teams that prefer a portal-based experience and as a shortcut for specific operational needs, such as fast emergency config updates that can bypass the normal CI/CD pipeline.</p><h4><strong>Staged rollouts and fast rollbacks</strong></h4><p>When a change is proposed, schema validation (checking that the config matches the expected structure and types) and other automated checks run in CI. 
The change is always reviewed and approved before rollout.</p><p>Once a change is merged into the main branch, the control plane performs a staged rollout: the change is first deployed to a limited scope, then gradually expanded to a larger scope if no regressions are detected. At each stage of this rollout, the change is evaluated, the author and the stakeholders are notified if regressions are detected, and a fast rollback can be triggered if needed. Staged rollouts can greatly reduce the blast radius of bad changes and improve the overall reliability of the platform.</p><h4><strong>Separated control and data planes</strong></h4><p>We separate the “decide” and “deliver” responsibilities. The control plane focuses on validation, authorization, and rollout decisions, while the data plane focuses on storing configs and distributing them reliably at scale. This separation allows us to evolve rollout strategies and policies without disrupting the underlying storage and delivery mechanisms, and vice versa.</p><h4><strong>Local caching and resilient clients</strong></h4><p>On the product services side, we introduced a local caching layer between the agent sidecar and the client library to improve resilience and availability. The agent sidecar runs alongside the main service container, regardless of which language the service is written in, and periodically fetches subscribed configs from the backend and persists them locally. The client libraries then read from this local cache. Even if the backend is temporarily unavailable or degraded, services can continue operating on the last known good configs from the local cache.</p><h3>Impact on product teams</h3><p>It is essential for the Sitar system to make life easier for product teams. 
In practice, its architecture changes how teams ship and operate in a few ways:</p><ul><li><strong>Rollouts become safer and more predictable.</strong> New behaviors, such as refined authorization rules, can be introduced gradually, verified on a small slice of traffic or in a specific environment, and rolled back quickly if needed. Teams spend less time worrying about “big bang” releases and more time iterating on behavior.</li><li><strong>Teams get more flexibility in how their configs are managed and rolled out.</strong> Each team can tailor a config flow to its own risk profile and release schedules. For example, teams can choose between automatic, manual, or cron rollouts, select the rollout strategy, and add extra checks. This lets teams keep their existing ways of working while still benefiting from a common platform and shared guardrails.</li><li><strong>Incident mitigation becomes faster and more controlled.</strong> When something goes wrong in production, incident responders can use observability tools that integrate config events to quickly locate the culprit change, then take quick action using the portal’s emergency flow. These emergency updates are fully auditable for future review.</li></ul><p>Besides these examples, the platform includes other improvements in usability, safety, and observability that we will not cover in detail here. Together, they contribute to a smoother day-to-day experience for teams that rely on dynamic configuration.</p><h3>Conclusions and next steps</h3><p>Dynamic configuration is a foundational capability of modern infrastructure. It enables fast iteration and rapid incident response, but only when it is equipped with strong safety features and provides a good developer experience. In this post, we shared how we think about a modern dynamic config platform at Airbnb, and how we developed Sitar’s architecture to meet those expectations.</p><p>The work is ongoing. 
As Airbnb’s business grows, we are continuing to refine rollout strategies, improve config testing, invest in observability and smart incident response tooling, and evolve other platform components.</p><p>In future posts, we plan to dive deeper into specific areas of the platform, such as how we optimize the Kubernetes sidecar that delivers config updates and how we design the developer experience around config management.</p><p>If this type of work interests you, check out our <a href="https://careers.airbnb.com/">open roles</a>.</p><h3>Acknowledgments</h3><p>Our progress with Sitar would not have been possible without the support and contributions of many people. We’d like to thank Craig Sosin, Nikolaj Nielsen, Daniel Fagnan, Alex Edwards, Xian Gao, Nick Morgan, Carolina Calderon, Hanfei Lin, Joyce Li, Yunong Liu, Alex Berghage, Brian Wolfe, Yann Ramin, Denis Sheahan, Richa Khandelwal, Swetha Vaidy, Abhishek Parmar, Adam Kocoloski, Adam Miskiewicz, and all the other engineers and teams at Airbnb who joined design reviews and offered valuable feedback.</p><p><em>All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&amp;referrerSource=full_rss&amp;postId=5aca5222ed68" width="1" height="1" alt=""><hr><p><a href="https://medium.com/airbnb-engineering/safeguarding-dynamic-configuration-changes-at-scale-5aca5222ed68">Safeguarding Dynamic Configuration Changes at Scale</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/safeguarding-dynamic-configuration-changes-at-scale-5aca5222ed68</link>
      <guid>https://medium.com/airbnb-engineering/safeguarding-dynamic-configuration-changes-at-scale-5aca5222ed68</guid>
      <pubDate>Wed, 18 Feb 2026 18:01:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Anna Sulkina]]></title>
      <description><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*eiZyGBIJvx3S5kREFK8wOw.jpeg"></figure><p><em>Anna Sulkina has always been a traveler, and we’re lucky her travels have brought her to Airbnb. Anna is a Senior Director of Engineering, and she’s responsible for Application &amp; Cloud infrastructure. She brings over two decades of industry experience to Airbnb, including work spanning the stack from the frontend to the backend to the plumbing that makes everything come together. Anna is a mother, a passionate trail runner, and an accomplished leader. Here’s Anna’s story in her own words.</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/506/1*S83sUpA6MqrQZ3ToqkXixg.jpeg"></figure><h3>Discovering a passion after the Soviet Union</h3><p>I grew up in Eastern Ukraine, and the year I was graduating from high school, the Soviet Union collapsed. Despite the political turmoil, it was an interesting time to get into technology, and I have my brother to thank for that.</p><p>I was always a nerdy kid, at school and at home, and my older brother really stoked that curiosity. He was studying computer hardware in Moscow, and he’d bring home computer parts to play with. I still remember the first computer he’d assembled, which required a cassette player to load programs. Only after many minutes of buzzing and clicking would the computer finally whirr to life.</p><p>Thinking back, that was really my first inspiration to work in technology. Seeing the inner workings of this new thing, a computer, and watching how the parts came together to form a whole — that’s what made me realize I wanted to work with computers, too.</p><p>Of course, I didn’t know the Soviet Union would end, which made studying in Moscow impossible. 
But technology was still my future.</p><h3>English: Harder than any programming language</h3><p>I got my start learning programming at a local Ukrainian university, and after four years of studying, I immigrated to America.</p><p>When I arrived, I knew how to program, and I knew how to write and read English, but I couldn’t communicate well. I took ESL classes at a community college and, in parallel, enrolled in Berkeley Extension classes to advance my C++ knowledge and learn Java, which was still very new at the time.</p><p>Throughout my first couple of jobs, I was more likely to run into challenges with the English language than with programming languages.</p><p>My first job was in computer hardware diagnostics at a tiny company with only five engineers, where we communicated directly with hardware manufacturers. This was right before the dot-com bubble burst.</p><p>I almost didn’t get the job, though. The interview process included a written portion that tested my knowledge of key computer science terms before I could move on to the coding problems. Given my prior education, I knew all the terms, but I ran out of time because the language gap slowed me down. Luckily, my interviewer happened to be taking the same Java class at Berkeley, and when I explained what happened, he gave me the chance to come back. I finished the test, got the job, and the rest is history.</p><p>In subsequent jobs, I transitioned fully from C++ to Java, which became my primary programming language for many years. I eventually got the hang of speaking English more confidently, but for a while, it still felt like Russian was my first language, Java was my second, and English was only my third.</p><h3>Going deeper in the stack and taking on leadership roles</h3><p>At times, my career felt all over the place. But looking back, I see a trajectory I wasn’t aware of at the time. 
I started with a brief stint in hardware diagnostics, but after that, I worked in the frontend and, over time, descended the software stack from frontend to backend to the deeper infrastructure I work with today.</p><p>Parallel to this trajectory down the stack was an upward trajectory in responsibility. Leadership wasn’t an obvious path for me at first — I had to be pitched multiple times — but the more I tried it, the more interesting and enjoyable it felt.</p><p>When I worked at Caymas Systems, a telecom startup, my manager was quick to recognize my leadership potential. He was really encouraging, but even more encouraging was witnessing the difference between teams with good leaders and those without.</p><p>After Caymas Systems, I worked at Comcast, where I eventually switched from individual contributor (IC) to engineering manager. Once I experienced the joys of coaching people, building cool software together, and developing high-performing teams, I knew this was the path I wanted to take.</p><h3>Fail whales and distributed systems</h3><p>This path took me through a formative time in my career: the almost nine years I spent working at Twitter. I began as a first-line manager and, over time, worked through some of Twitter’s biggest events, including the “fail whale” era and the Ellen DeGeneres “selfie that broke Twitter” moment.</p><p>This was an exciting time. I was working at the heart of Twitter’s tech stack, supporting teams that powered its consumer and revenue verticals. This is where I grew into a senior manager and, eventually, a director. Looking back over nearly a decade of work, two major lessons stand out: one technical and one cultural.</p><p>The technical lesson was about failure — namely, its inevitability.</p><p>Over my tenure, the Twitter stack transitioned from a monolith to a microservices architecture. 
This resulted in a set of robust, high-scale, low-latency distributed systems, and it was here that I learned that, when building resilient distributed systems, you need to design <em>for</em> failure, not hope to avoid it.</p><p>I often think back to <a href="https://how.complexsystems.fail/">How Complex Systems Fail</a> — the more complex a system, the more likely it is to fail. I remembered that lesson every time we were called to the Twitter command center to deal with an incident; it was all hands on deck until everything was back online.</p><p>The cultural lesson was about adoption and what it takes to fuel great ideas.</p><p>Today, <a href="https://thenewstack.io/graphql-growth-explodes-but-so-do-problems-federated-graphs-solve/">almost two-thirds of enterprises</a> use GraphQL in production, but in its early days, it was a new, largely untested idea. During a hack week, a couple of engineers laid the groundwork for using this technology at Twitter. I worked closely with them and bootstrapped the team that eventually built Twitter’s GraphQL API, replacing the legacy REST services.</p><p>I still think about this experience today. It required convincing leadership and building consensus across numerous teams and stakeholders, but once we did, the payoff was significant: this one technical choice accelerated the velocity of product feature teams across the company.</p><h3>Why I picked Airbnb</h3><p>When Airbnb reached out in 2022, I realized my time at Twitter was coming to a close. By that point, my organization was well-run and high-performing — a success, but also a sign that I was ready for my next adventure.</p><p>Airbnb immediately stood out because the company offered, for the first time in my career, a true alignment between my personal and technical interests. I love traveling, and I have been a long-time Airbnb guest since 2013. 
I had always wanted to work for a company that built a product I truly cared about, and this was my chance.</p><p>I only got more excited when I learned about the people and teams I’d be working with. The Developer Platform organization, which was responsible for supporting all of Airbnb’s engineers, faced challenges I’d seen before. There was a lot of good work happening in silos, and folks were longing for a clear strategy and direction. Also, I saw an opportunity to not only improve developer experience but also build trust with the rest of the engineers and stakeholders.</p><p>So, I started at the beginning. We focused on setting up the organization, coaching leadership, and building internal alignment within the team, as well as external alignment across all the teams we supported. Fundamental questions like “Why are we here together?” and “Where are we going?” all had to be answered.</p><p>After a year or two of this work, we had a high-performing team with a clear strategy and strong execution, consistently delivering business value and improving the developer experience and productivity at Airbnb. Even more importantly, we earned the rest of the engineers’ trust, and we enabled our technical teams to perform better.</p><p>We saw this reflected in the bi-annual DevX surveys (which we built out), and the results showed overall developer satisfaction increasing about 10% year over year during my time on the team.</p><h3>Solving new problems while working from anywhere</h3><p>Today, I’m Senior Director of Engineering for Application &amp; Cloud Infrastructure, which includes compute, networking, core services, and the GraphQL application platform. Our mission is to deliver reliable, secure, and efficient platforms for building, operating, and scaling applications, services, and workloads at Airbnb.</p><p>My primary users are still the engineers at Airbnb. 
When they need compute, they don’t wrangle AWS themselves — we provide a layer of abstraction that helps them use low-level infrastructure. Similarly, if they need authorization, authentication, configuration management, or any of a host of other services, they come to us rather than starting from scratch.</p><p>I’m excited to come to work every day because of the people I get to work with and the opportunities we face together. The culture is excellent, the people are smart and collaborative, and the engineers we support appreciate the work we do.</p><p>The setup is empowering, too, and as you solve problems, you can grow and expand to tackle bigger problems that span teams and organizations. Add in the ability to work from anywhere, and for me, it feels like the sky is the limit.</p><p>As I look back on my career, and really, my entire life, I tend to see it now through the lens of long-distance trail running — a major hobby of mine.</p><p>After working at a startup, having twins, and running my first marathon, I felt like I could do anything. At work and on the trails, I think about how to prepare for the journeys ahead and how to maintain a pace that allows me and the people around me to thrive in the long run. Recovery is necessary, but so are strategy, drive, discipline, and finding the people who will go with you as well as cheer you on along the way.</p><p>I’m happy this path, as unpredictable as it has been, has taken me to Airbnb. Airbnb is in that ideal position between a startup and a long-established company. The systems and workflows are mature, but there are still many interesting problems to solve and opportunities to pursue. 
If that’s of interest to you, I encourage you to check out <a href="http://careers.airbnb.com/">openings</a> at Airbnb.</p><hr><p><a href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-anna-sulkina-85216183d094">My Journey to Airbnb — Anna Sulkina</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-anna-sulkina-85216183d094</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-anna-sulkina-85216183d094</guid>
      <pubDate>Wed, 11 Feb 2026 18:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb: Peter Coles]]></title>
      <description><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*kQQ1LthGsvtJyUW5bVf0RA.jpeg"></figure><h3>Public school to PhD</h3><p><em>The story of Airbnb’s Head Economist for Policy and Director of Data Science involves geology, co-teaching with a Nobel Prize winner, and CSI. (No, not the hit TV franchise.)</em></p><p><em>Peter Coles was born and raised in Milwaukee, Wisconsin. He studied math at Princeton, earned his PhD in economics at Stanford, and taught at Harvard Business School before joining eBay and becoming a Data Science leader at Airbnb.</em></p><p><em>As you’ll see from his story, Peter has a deep interest in how marketplaces work. By transitioning from academia to the business world, he not only gets to study first-hand data about millions of guests and hosts, but also to influence product and policy decisions. And he still gets to hang out with academics. Check out all the research Peter and his team are doing </em><a href="https://airbnb.tech/research/academic-papers/"><em>here</em></a><em>.</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*MaK3TijQQh8K1OxImjmBWA.png"></figure><p>My fascination with marketplaces goes back a long time.</p><p>Sometime around second grade in Milwaukee, Wisconsin (where I grew up), my friends and I had the great idea to run a rock stand. It was like a lemonade stand, but instead we would sell rocks. Rocks we found in the street. Neighborhood kids could find their own rocks and sell them, and we’d take 25%. Nobody got any sales. Fortunately I’ve learned a bit more about marketplaces since then — more about that shortly.</p><p>From kindergarten through high school, I was a public school kid. 
My parents valued education, and helping others — my father was a doctor, my mother a nutritionist — and those are values I still hold dear.</p><p>While I played soccer and tennis and was moderately social, this period was probably best defined by an obsession with competitions. Math, Chess, Science Olympiad, Quiz Bowl, Academic Decathlon, puzzle races with my younger brother — there was hardly a nerdy competition offered where I didn’t compete. Time well spent? Let’s just say I missed a lot of high school parties while studying to become the five-time Wisconsin State Rocks, Minerals, and Fossil Identification champion — so you can be the judge.</p><p>By the time college came around, it seemed time to rebel. I wanted to be done with the nerdy stuff. I applied and was accepted to Princeton, and started studying ancient history. After one semester and (in my view) an underappreciated essay on the Hittites, I was back to majoring in math. At least I was good at that! I figured I could work on practical skills later.</p><p>After graduating, I accepted a fellowship in Germany and continued to Stanford for a PhD in economics — a somewhat more applied science, though I focused on game theory, at the intersection of math and strategy. I had the good fortune there to be the second graduate student of Jon Levin, now the President of Stanford University, who taught me the importance of simplification in research — even when the subject matter itself is complex.</p><p>Even while in this still-theoretical space, I kept my feet on the ground — or at least on the pedals. During a monthlong break in my classes in Germany, I biked around Europe, crashing with friends and family members of my classmates — people I had never met before staying with them. In a sense, I was prototyping Airbnb well before it existed!</p><p>My time in graduate school was Silicon Valley in the 2000s, after the dot-com crash, so tech was in the midst of a renaissance. 
Many of my friends were at growing companies like Google and Amazon. It was very tempting to stay in California to be a part of this, but I ended up with one more stop in academia.</p><h3>Markets in theory, markets in practice</h3><p>Harvard Business School, known for its focus on managerial science, was perhaps the most compelling place in the academic world that would allow me to stay close to the tech industry. I got a double stroke of good fortune: not only was I offered an assistant professorship there, but I also got to co-teach with Al Roth, a founder of the field of Market Design. Al is still an important mentor to me, and later won a Nobel Prize!</p><p>In my time researching and teaching graduate students, I was exposed to many examples of market design, conducting research on the topic of “Matching”; that is, mechanisms to pair users from two groups, often when price cannot be used to clear the market. This covered <a href="https://www.sciencedirect.com/science/article/abs/pii/S0899825614000074">strategy of participants</a>, <a href="https://www.aeaweb.org/articles?id=10.1257/mic.5.2.99">signaling in markets</a>, and I even had a chance to improve the <a href="https://www.aeaweb.org/articles?id=10.1257/jep.24.4.187">market for PhD economists</a>. I also wrote a number of case studies, including on Zillow, Microsoft, Craigslist, and more. The teaching and writing was a lot of fun, but I also came to realize I wasn’t a fit for academia in the long term. My attention span was too short to dedicate most of my time to research papers (and especially peer reviews), but I was enormously appreciative of this phase of my career.</p><p>By this point it was 2013, and two simultaneous and interrelated phenomena were exploding in tech: mobile and the sharing economy. 
It was a perfect time to head back west and finally enter the tech world.</p><p>I landed at eBay, which for a student of marketplaces was an ideal company: just about everything is for sale, and it was ripe for market design. Steve Tadelis, a mentor from my Stanford days, had created one of the first economics teams inside a tech company, which I took over when Steve left. At the same time, eBay was getting on the data science train — this was before every company had a DS team — and my group joined another to form eBay’s Data Labs. One of my favorite projects there was “What’s it Worth” (which I worked on with Airbnb colleague Dean Chen), where we developed a methodology for determining the fair market value of items. Some hands-on practical work, some modeling — this was just what I was hoping for.</p><p>In 2015 Riley Newman, one of Airbnb’s first employees and then its Head of Data Science, presented an even more enticing opportunity. The Airbnb platform was growing quickly, and for the first time attracting substantial regulatory attention. They needed an economics team to partner with the growing policy team, to jointly address the question of Airbnb’s relationship to cities. This was a new way for me to apply economics. I was all in.</p><h3>Addressing Airbnb’s critical questions with data</h3><p>When I think back on my eight and a half years so far at Airbnb, I see three “phases.” In the first, I worked to address economic questions by establishing a global team of data scientists and economists to analyze the relationship of short-term rentals to the world.</p><p>Meanwhile, as Airbnb continued to grow, execs were asking big questions that couldn’t be answered by any specific data science team. They needed a group with visibility across the whole organization. 
So in this second phase, Jackson Wang and I founded a team called Central Strategy &amp; Insights, or CSI.</p><p>The acronym was no coincidence: we saw ourselves as forensic investigators, piecing together stories as we collected evidence. One important period of CSI’s work addressed changes brought on by the pandemic — in particular a major adjustment in where guests were looking to stay, and the supply we’d need to accommodate them. We also led the company’s business reviews, and generated analyses to describe the business to shareholders ahead of the IPO.</p><p>My third phase at Airbnb started a lot like the first, but supersized: developing models to inform a well-considered approach to policy considerations, this time as travel rebounded after the pandemic and governments were no longer fully occupied with a public health emergency. Our newly expanded group of economics PhDs and analysts also came up with ways to evaluate Airbnb’s impact on guests, hosts, and society, including via our <a href="https://news.airbnb.com/economic-impact-2023-us/">US Economic Impact Report.</a></p><h3>A great balance of academic and applied science</h3><p>Almost all of my first several years at Airbnb had been internally facing. That’s changed in recent times, as we’ve spun up and expanded a program to collaborate with academic researchers to analyze Airbnb’s data and improve the experience for users.</p><p>The first step was to figure out <em>how</em> to collaborate with external researchers, while respecting privacy and legal limitations. Collaboration interest then came quickly. 
We have now published well-received papers with professors from <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3583836">MIT</a>, Berkeley, <a href="https://arxiv.org/abs/2002.05670">Stanford</a>, <a href="https://www.sciencedirect.com/science/article/pii/S0169207024000049">UCLA</a>, <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3048397">NYU</a> and more, with others in progress. One paper I wrote with colleagues and academic partners develops foundations for what “<a href="https://airbnb.tech/wp-content/uploads/sites/19/2026/01/Quality-Externalities-Full-Paper.pdf">quality</a>” means in platforms, from an economic perspective.</p><p>We’ve also launched a monthly seminar where we invite our academic collaborators to discuss research with Airbnb data scientists and technologists. Developing research is great, but there’s nothing like live discussions to cross-pollinate and foster ideas. This builds on a strong collaborative learning tradition at Airbnb, with internal classes and reading groups to grow our skills and keep up with tech developments.</p><p>Alongside engaging with academia, I’m so excited my data science colleagues and I have a mandate to be innovative and proactive. We have the space and encouragement to work on big ideas, even if they might take a year or two to prove out — and perhaps more importantly, even if some of the ideas fail. But nothing is more important than the people. I am proud of the students, scientists, and even professors I have hired here over the years, and love seeing them grow and find success.</p><p>A license to tackle big topics, continual education, research on the product as well as its relationship to the outside world, amazing colleagues, and a direct connection to academia all make Airbnb a unique place to be a market designer, economist, and data scientist. 
Whether or not you spent your free time in eighth grade studying rocks.</p><p>If you want to learn more about the research happening at Airbnb you can read our published papers <a href="https://airbnb.tech/research/academic-papers/">here</a>. If this type of work interests you, check out our <a href="https://careers.airbnb.com/positions/">open roles</a>.</p><hr><p><a href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-peter-coles-0e8efd01c5a8">My Journey to Airbnb: Peter Coles</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-peter-coles-0e8efd01c5a8</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-peter-coles-0e8efd01c5a8</guid>
      <pubDate>Wed, 28 Jan 2026 19:04:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Pay As a Local]]></title>
      <description><![CDATA[<h3><strong>How Airbnb rolled out 20+ locally relevant payment methods worldwide in just 14 months</strong></h3><p><strong>By: </strong><a href="https://www.linkedin.com/in/gerumhaile"><strong>Gerum Haile</strong></a><strong>, </strong><a href="https://www.linkedin.com/in/bo-shi-0321a693"><strong>Bo Shi,</strong></a><strong> </strong><a href="https://www.linkedin.com/in/yujialiu1992"><strong>Yujia Liu</strong></a><strong>, </strong><a href="https://cn.linkedin.com/in/yanwei-bai-ba5bb315b"><strong>Yanwei Bai</strong></a><strong>, </strong><a href="https://www.linkedin.com/in/boyuan-dev/"><strong>Bo Yuan</strong></a><strong>, </strong><a href="https://www.linkedin.com/in/rory-macqueen-28242ba2/"><strong>Rory MacQueen</strong></a><strong>, </strong><a href="https://www.linkedin.com/in/yixiamao"><strong>Yixia Mao</strong></a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6K6D4WxFwqlwdtBc6uzczw.jpeg"></figure><p>Across the more than 220 global markets that Airbnb operates in, cards are the primary way that guests pay for stays, experiences, and services. However, to help make our platform accessible to more people, reduce friction at checkout, and drive more adoption, we introduced trusted, locally preferred payment methods — called local payment methods or LPMs. 
By offering and supporting these payment methods, Airbnb enables guests everywhere to choose what works best for them.</p><p>In this blog post, we’ll discuss the implementation details behind our Pay as a Local initiative, which allowed us to launch 20+ local payment methods across multiple markets in just over one year.</p><h3>LPMs: What they are, why they matter, and our discovery and selection process</h3><p>Local payment methods go beyond traditional cards and include:</p><ul><li>Country- or region-specific digital wallets (such as M-Pesa or MTN MoMo)</li><li>Online bank transfers (such as Online Banking Czech, Online Banking Slovakia)</li><li>Real-time or instant bank payments (such as PIX, UPI)</li><li>Local payment schemes (such as EFTPOS, Cartes Bancaires)</li></ul><p>By embracing LPMs, Airbnb helps make travel more inclusive and seamless for people around the world. LPMs help the platform to:</p><ul><li><strong>Boost conversion and bookings</strong> by offering guests familiar, trusted payment options.</li><li><strong>Unlock new markets</strong> where credit card usage is low or non-existent.</li><li><strong>Build accessibility</strong> for guests without credit cards or traditional banking access.</li></ul><p>Through our research on local payment methods (LPMs), we identified over 300 unique payment options worldwide. For the initial phase of the LPM initiative, we used a structured qualification framework to select which local payment methods we would support. We evaluated the top 75 travel markets and selected the top one to two payment methods per market — excluding those without a clear travel use case — and arrived at a shortlist of just over 20 LPMs best suited for integration into our payment platform.</p><h3>Background on Airbnb’s payment platform</h3><p>Airbnb’s payments platform is designed to decouple payment logic from the core business (i.e., stays, experiences, and services), allowing for greater flexibility and scalability. 
The platform efficiently coordinates both guest pay-ins and host payouts by working with regulated payment service providers and financial partners.</p><p>Beyond payment processing, the system also supports robust payment trust and compliance functions.</p><h3>Modernization</h3><p>As part of a multi-year replatforming initiative for our payments architecture called Payments LTA (long-term architecture), we shifted from a monolithic system to a capability-oriented services system structured by domains, using a domain-driven decomposition approach. This modernization approach reduced our time to market, increased reusability and extensibility, and empowered greater team autonomy.</p><p>The core payment domain delivers essential capabilities for pay-in, payout, and payment intermediation. It consists of multiple subdomains, including Pay-in, Payout, Transaction Fulfillment, Processing, Wallet &amp; instruments, Ledger, Incentives &amp; Stored Value, Issuing, and Settlement &amp; Reconciliation.</p><h3>Replatforming as an enabler for local payment method expansion</h3><p>The processing subdomain enables integration with third-party payment service providers (PSPs) and supports API and file-based vendor integration, as well as switching and routing capabilities. As part of our replatforming initiative, we adopted a connector and plugin-based architecture for onboarding new third-party payment service providers. This strategy has significantly reduced the time required to integrate new PSPs in different markets.</p><p>During this replatforming effort, we also introduced <strong>Multi-Step Transactions (MST)</strong>: a processor-agnostic framework that supports payment flows completed across multiple stages. MST defines a PSP-agnostic transaction language to describe the intermediate steps required in a payment, such as submitting supplemental data or handling dynamic interactions. 
These steps, called Actions, can include:</p><ul><li>Redirects</li><li>Strong customer authentication (SCA) frictions (challenges, fingerprinting)</li><li>Payment method-specific flows</li></ul><p>When a PSP indicates that an additional user action is required, its vendor plugin normalizes the request into an ActionPayload and returns it with a transaction intent status of ACTION_REQUIRED. This architecture ensures consistent handling of complex, multi-step payment experiences across diverse PSPs and markets.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ajAC0o7YNT-L-TgI2YVPnw.png"></figure><h3>LPM integration architecture</h3><p>While our modernized payment platform laid the foundation for enabling LPMs, these payment methods come with a unique set of challenges. Many local methods require users to complete transactions in third-party wallet apps. This introduces complexity in app switching, session hand-off, and synchronization between Airbnb and external digital wallets.</p><p>Each local payment vendor also exposes different APIs and behaviors across charge, refund, and settlement flows, making integration and standardization difficult.</p><h3>Technical approach</h3><p>We analyzed the end-to-end behavior of our 20+ LPMs and identified three foundational payment flows that capture the full spectrum of user and system interactions. 
By distilling LPM behaviors into these standardized payment flow archetypes, we established a unified framework for integration:</p><ol><li><strong>Redirect flow:</strong> Guests are redirected to a third-party site or app to complete the payment, then return to Airbnb to finalize their booking (e.g., Naver Pay, GoPay, FPX).</li><li><strong>Async flow:</strong> Guests complete payment externally after receiving a prompt (such as a QR code or push notification), and Airbnb receives payment confirmation asynchronously via webhooks (e.g., Pix, MB Way, Blik).</li><li><strong>Direct flow:</strong> Guests enter their payment credentials directly within Airbnb’s interface, allowing real-time processing similar to traditional card payments (e.g., Cartes Bancaires, Apple Pay).</li></ol><p>This standardized approach has enabled significant reusability across integrations and substantially reduced the engineering effort required to support new payment methods.</p><h3>Asynchronous payment orchestration</h3><p>Since many guests complete payments through external providers, we redesigned our payment orchestration — building on top of MST — to support payment flows that require user actions outside Airbnb (redirect flows and async flows).</p><p>For redirect flows, where guests complete the payment on a third-party app or website:</p><ul><li>Airbnb’s payments platform sends a charge request to the local payment vendor, whose response includes a redirectUrl.</li><li>Our platform redirects the user to the external app or website to complete the payment.</li><li>Once the payment is successfully completed, the user is redirected back to Airbnb with a result token. 
Airbnb’s payments platform then uses this token to securely confirm and finalize the payment with the local processor.</li></ul><p>For async flows (which typically involve scanning a QR code):</p><ul><li>Airbnb’s payments platform sends a charge request to the local payment vendor, whose response includes a qrCodeData.</li><li>The checkout page displays the QR code for the user to scan and complete the payment in their wallet app.</li><li>After the payment succeeds, the vendor sends a webhook notification to Airbnb’s payments platform, which updates the payment status to success and confirms the user’s order.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RVumwnLlf9S8DcwfjZxQAg.png"></figure><h4>Naver Pay: Redirect To Naver Pay Website</h4><p>Naver Pay is one of the fastest-growing digital payment methods in South Korea. As of early 2025, it has reached over 30.6 million active users, representing approximately 60% of the South Korean population. Enabling Naver Pay in the South Korean market not only helps deliver a more seamless and familiar payment experience for local guests, but also expands Airbnb’s reach to new users who prefer using Naver Pay as their primary payment method.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/265/1*xmEalXsiRw5vzv0cMyCe5g.gif"></figure><h4>Pix: Scan A QR Code</h4><p><a href="https://news.airbnb.com/br/airbnb-anuncia-parcelamento-sem-juros-para-pagamento-de-estadias/">Pix</a> is an instant payment system developed by the Central Bank of Brazil, enabling 24/7 real-time money transfers through methods such as QR codes or Pix keys. Its adoption has been extraordinary — by late 2024, more than 76% of Brazil’s population was using Pix, making it the country’s most popular payment method, surpassing cash, credit, and debit cards. 
In 2024 alone, Pix processed over BRL 26.4 trillion (approximately USD 4.6 trillion) in transaction volume, underscoring its pivotal role in Brazil’s digital payment ecosystem.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/864/1*jqRjCcvFGvlIw1T3NT13Wg.gif"></figure><h3>Config-driven payment method integration</h3><p>Airbnb embraced a config-driven approach, powered by a central YAML-based Payment Method Config that acts as a single source of truth for flows, eligibility, input fields, refund rules, and more. Instead of scattering payment method logic across the frontend, backend, and various services, we consolidate all relevant details in this config. Both core payment services and frontend experiences dynamically reference this single source of truth, ensuring consistency for eligibility checks, UI rendering, and business rules. This unified approach dramatically reduces duplication, manual updates, and errors across the stack, making integration and maintenance faster and more reliable.</p><p>These configs also drive automated code generation for backend services using code generation tools, producing Java classes, DTOs, enums, schema, and integration scaffolding. As a result, integrating or updating a payment method is largely declarative — just a config change. This streamlines launches from months to weeks and makes ongoing maintenance far simpler.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2elXD-3BHco1AqeSAXPQDg.png"></figure><h4>Payment widget</h4><p>Our payment widget — the payment method UI embedded into the checkout page — includes the list of available payment methods and handles the user’s inputs. Local payment methods often require specialized input forms (such as CPF for Pix) and have unique country/currency eligibility.</p><p>Rather than hardcoding forms and rules into the client, we centralize both form-field specification and eligibility checks in the backend. 
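</p><p>A minimal sketch of this backend-driven idea, in Python. A plain dictionary stands in for the YAML config, and every key name, config entry, and helper function here is a hypothetical illustration rather than Airbnb’s actual schema or API.</p><pre>
```python
# Hypothetical payment method config. In the post this lives in YAML;
# a dict is used here only to keep the sketch self-contained.
PAYMENT_METHOD_CONFIG = {
    "PIX": {
        "flow": "ASYNC",
        "countries": ["BR"],
        "currencies": ["BRL"],
        "input_fields": [
            {"name": "first_name", "required": True},
            {"name": "last_name", "required": True},
            {"name": "cpf", "required": True},
        ],
    },
    "NAVER_PAY": {
        "flow": "REDIRECT",
        "countries": ["KR"],
        "currencies": ["KRW"],
        "input_fields": [],
    },
}


def eligible_methods(country, currency, config=PAYMENT_METHOD_CONFIG):
    """Return the names of methods whose config allows this country and currency."""
    return [
        name
        for name, spec in config.items()
        if country in spec["countries"] and currency in spec["currencies"]
    ]


def widget_payload(country, currency, config=PAYMENT_METHOD_CONFIG):
    """Build the payload a client could use to render payment options and forms."""
    return [
        {
            "method": name,
            "flow": config[name]["flow"],
            "fields": config[name]["input_fields"],
        }
        for name in eligible_methods(country, currency, config)
    ]
```
</pre><p>In this sketch, supporting another method means adding one more config entry; the eligibility check and the widget payload pick it up without any client-side change.</p><p>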
Servers send configuration payloads to clients defining exactly which fields to collect, which validation rules to apply, and which payment options to render. This empowers the frontend to dynamically adapt UI and validation for each payment method, accelerating launches and keeping user experiences fresh without frequent client releases.</p><p>For example, Pix in Brazil requires the guest’s first name, last name, and CPF (tax ID), which we collect and transmit as required to complete the payment.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/764/1*h58IqdWRxO_5U7ZrD_cMDA.png"></figure><p>Below is a diagram illustrating how dynamic payment method configurations are delivered from the backend to the frontend, enabling tailored checkout presentations for each payment method.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-K2aydrTL6jJGHouKu1M2Q.png"></figure><h3>Building confidence through better testability</h3><p>Testing local payment methods can be difficult, because developers often don’t have access to local wallets. Yet with such a broad range of payment methods and complex flows, comprehensive testing is essential to prevent regressions and ensure seamless functionality.</p><p>To address this, we enhanced Airbnb’s in-house <strong>Payment Service Provider (PSP) Emulator</strong>, enabling realistic simulation of PSP interactions for both redirect and asynchronous payment methods. The Emulator allows developers to test end-to-end payment scenarios without relying on unstable (or nonexistent) PSP sandboxes. For redirect payments, the Emulator provides a simple UI mirroring PSP acquirer pages, allowing testers to explicitly approve or decline transactions for precise scenario control. 
For async methods, it returns QR code details and automatically schedules webhook emission tasks upon receiving a /payments request — delivering a complete, reliable testing environment across diverse LPMs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ATFCaiOr1uS11DfoO2q5_Q.png"></figure><h3>Scaling observability for local payment methods</h3><p>Maintaining high reliability and availability is critical for Airbnb’s global payment system. As we expand to support many new local payment methods, we face increasing complexity: greater dependencies on external PSPs and wide variations in payment behaviors. For example, a real-time card payment and a redirect flow like Naver Pay follow completely different technical paths. That diversity makes observability difficult — a single “payment success rate” may represent card health well, but say little about an asynchronous LPM. Without proper visibility, regressions can go unnoticed until they affect real users. As dozens of new LPMs go live, observability has become the foundation of reliability.</p><p>To address this, we built a centralized monitoring framework that unifies metrics across all layers, from client to PSP. When launching a new LPM, onboarding now requires a single config change; add the method name, and metrics begin streaming automatically:</p><ul><li><strong>Client metrics</strong> — user-level flow health from clients</li><li><strong>Payment backend metrics</strong> — API-level metrics for payment flows</li><li><strong>PSP metrics</strong> — API-level visibility between Airbnb and the PSP</li><li><strong>Webhook metrics</strong> — async completion status for redirect methods or refunds</li></ul><p>We have also standardized the alerting rules across our platform’s Client, Backend, PSP, and Webhook layers using composite alerts and anomaly detection. 
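</p><p>One way to picture a composite alert is as a rule that fires only when several conditions hold at once. The TypeScript sketch below is a minimal illustration under that assumption; the type names, field names, and the <code>shouldAlert</code> helper are hypothetical and do not describe Airbnb's monitoring stack.</p>
<pre>
```typescript
// Hypothetical composite alert rule: it fires only when BOTH the absolute
// failure count and the failure rate exceed their thresholds in the window.
interface CompositeAlertRule {
  name: string;
  minFailures: number;    // failure count must be greater than this
  minFailureRate: number; // failure rate (0..1) must be greater than this
  windowMinutes: number;  // window the stats were aggregated over
}

// Aggregated stats for one window; assumed to be computed elsewhere.
interface WindowStats {
  attempts: number;
  failures: number;
}

const naverPayResume: CompositeAlertRule = {
  name: "Naver Pay resume failures",
  minFailures: 5,
  minFailureRate: 0.2,
  windowMinutes: 30,
};

function shouldAlert(rule: CompositeAlertRule, stats: WindowStats): boolean {
  if (stats.attempts === 0) return false; // no traffic, nothing to evaluate
  const countBreached = stats.failures > rule.minFailures;
  const rateBreached = stats.failures / stats.attempts > rule.minFailureRate;
  return [countBreached, rateBreached].every(Boolean);
}
```
</pre>
<p>With this rule, 3 failures out of 4 attempts (a 75% rate) stays quiet because the count threshold is not met, while 10 failures out of 40 attempts triggers the alert.</p><p>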
Each alert follows a consistent pattern (failure count, rate, time window), e.g., “Naver Pay resume failures &gt; 5 and failure rate &gt; 20% in 30 minutes.” This design minimizes false positives during low-traffic periods.</p><p>This framework scales effectively, providing end-to-end visibility from user click to PSP confirmation. It enables engineers to trace issues in minutes rather than hours, whether those issues were caused by internal changes or external outages. By turning observability into a shared, automated layer, we were able to strengthen the backbone of payment reliability while accelerating the rollout of new LPMs worldwide.</p><h3>Impact</h3><p>The Pay as a Local initiative delivered significant business and technical impact:</p><ul><li><strong>Meaningful booking uplift:</strong> We observed <strong>meaningful uplift</strong> in bookings and new users in markets where we launched local payment methods.</li><li><strong>Faster integrations:</strong> Reduced integration time <strong>significantly</strong> through reusable flows and config-driven automation.</li><li><strong>Stronger reliability:</strong> Improved observability for early outage detection, standardized testing to prevent regressions, and streamlined vendor escalation and on-call processes for global resilience.</li></ul><h3>Conclusion</h3><p>Supporting local payment methods helps Airbnb stay competitive and relevant in the global travel industry. These payment options help improve checkout conversion, drive adoption, and unlock new growth opportunities.</p><p>This post outlined how the Airbnb payment platform has evolved to support local payment methods at scale through asynchronous payment orchestration, config-driven onboarding, centralized observability, and robust testability. 
Together, these capabilities enable faster integrations, lower maintenance overhead, and a more seamless, localized checkout experience for guests worldwide.</p><p>As Airbnb continues to expand globally, our payments platform will keep evolving with the same principles of extensibility, reliability, and scalability, ensuring that guests everywhere can pay confidently, using the methods they know and trust.</p><h3>Acknowledgments</h3><p>Many people at Airbnb contributed to this big rearchitecture. Countless thanks to <a href="mailto:mini.atwal@airbnb.com">Mini Atwal</a>, <a href="mailto:ashish.singla@airbnb.com">Ashish Singla</a>, <a href="mailto:musaab.attaras@airbnb.com">Musaab At-Taras</a>, <a href="mailto:linmin.yang@airbnb.com">Linmin Yang</a>, <a href="mailto:yong.rhyu@airbnb.com">Yong Rhyu</a>, <a href="mailto:yohannes.tsegay@airbnb.com">Yohannes Tsegay</a>, <a href="mailto:livar.cunha@airbnb.com">Livar Cunha</a>, <a href="mailto:praveena.subrahmanyam@airbnb.com">Praveena Subrahmanyam</a>, <a href="mailto:steve.ickes@airbnb.com">Steve Ickes</a>, <a href="mailto:vijaykumar.borkar@airbnb.com">Vijaykumar Borkar</a>, Vibhu Ramani, Aashna Jain, Abhishek Ghosh, Abhishek Patel, Adithya Tammavarapu, Akai Hsieh, Akash Budhia, Amar Parkash, Amee Mewada, Ankita Balakrushan Tate, Bharath Kumar Chandramouli, Bo Shi, Bo Yuan, Callum Li, 
Carlos Townsend Pico, Chanakya Daparthy, Charles Tang, Cibi Pari, Cindy Jaimez, Cindy Shi, Dan Yo, Daniela Nobre, Danielle Zegelstein, David Cordoba, David Drinan, Dawei Wang, Dechuan Xu, Denise Francisco, Denny Liang, Dimi Matcovschi, Divya Verma, Feifeng Yang, Gabriel Siqueira, Sunny Wallia, Prashant Jamlakar, Daniel Kriske, Giovanni Iniguez, Haojie Zhang, Haokun Chen, Haoti Zhong, Harriet Russell, Harshit Gupta, Henrique Moreira Indio do Brasil, Ishan Ishan, Jenny Shen, Jerroid Marks, Jiafang Jiang, Joey Yin, Jon Chew, Karen Kuo, Katie Turley, Letian Zhang, Maneesh Lall, Manish Singhal, Maria Daneri, Mark Jang, Mengfei Ren, Michelle Desiderio, Mohit Dhawan, Nam Kim, Nerea Ruiz Alvarez, Nikita Kapoor, Oliver Zhang, Omer Faruk Gul, Pallavi Sharma, Prateek Sri, Rae Huang, Rohit Krishnan Dandayudham, Rory MacQueen, Ruize Liu, Sam Bitter, Sam Tang, Saran Singh, Sardana Sai Anil, Serdar Yildirim, Shwetha Saibanna, Silvia Crespo Sanchez, Simon Xia, Stella Dong, Stella Su, Stephanie Leung, Steve Cao, Sumit Ranjan, Tay Rauch, Thanigaivelan Manickavelu, Tiffany Selby, Toland Hon, Trish Burgess, Vishal Garg, Vivian Lue, Vyom Rastogi, William Betz, Xi Wen, Xing Xing, Xuanxuan Wu, Yangguang Li, Yanwei Bai, Yeung Song, Yixia Mao, Yujia Liu, Yun Cho, Zhenhui Zhu, Ziyun Ye.</p><h3>****************</h3><p><em>All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. 
Use of these names, logos, and brands does not imply endorsement.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&amp;referrerSource=full_rss&amp;postId=bef469b72f32" width="1" height="1" alt=""><hr><p><a href="https://medium.com/airbnb-engineering/pay-as-a-local-bef469b72f32">Pay As a Local</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/pay-as-a-local-bef469b72f32</link>
      <guid>https://medium.com/airbnb-engineering/pay-as-a-local-bef469b72f32</guid>
      <pubDate>Mon, 12 Jan 2026 19:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[GraphQL Data Mocking at Scale with LLMs and @generateMock]]></title>
      <description><![CDATA[<div class="id ic iw ix iy"><div class="ac ci"><div class="cp bi ij ik il im"><div><div></div></div></div></div><div class="ac ci oi oj ok ol" role="separator"><div class="id ic iw ix iy"><div class="ac ci"><div class="cp bi ij ik il im"><p id="41f0" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl"><em class="pk">How Airbnb combines GraphQL infra, product context, and LLMs to generate and maintain convincing, type-safe mock data using a new directive.</em></p><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm pn"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*wHF0IXmtHzIfzv-IZh_CKA.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h2 id="2837" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">Introduction</h2><p id="dee9" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">Producing valid and realistic mock data for testing and prototyping with GraphQL has been a persistent challenge across the industry for years. Mock data is tedious to write and maintain, and attempts to improve the process, such as random value generation and field-level stubbing, fall short because they lack essential domain context to make test data realistic and meaningful. 
The time spent on this manual work ultimately takes away from what most engineers would like to focus on: building features.</p><p id="3506" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">In this post, we’ll explore how we’ve reimagined mocking GraphQL data at Airbnb by combining GraphQL validation, rich product and schema context, and LLMs to generate and maintain convincing, type-safe mock data. Our solution centers around a simple new GraphQL client directive, @generateMock, that engineers can add to any operation, fragment, or field. This approach eliminates the need for engineers to manually write and maintain mocks as queries evolve, freeing up time to focus on building the product.</p><h2 id="ff28" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">Key challenges</h2><p id="d057" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">After meeting with Airbnb product engineers and analyzing results from internal surveys, we distilled the most common pain points around GraphQL mocking down into three key challenges:</p><ol class=""><li id="e744" class="op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj ra rb rc bl"><strong class="or jc">Manually creating mocks is time-consuming.</strong> GraphQL queries can grow to hundreds of lines, and hand-crafting mock response data is extremely tedious. Most engineers manually write mocks either as raw JSON files or by instantiating types generated from the GraphQL schema, while others modify copy-and-pasted JSON responses from the server. 
Although both of these methods can yield realistic-looking data that can be used for demos and snapshot tests, they require significant time investment and are prone to subtle mistakes.</li><li id="9081" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj ra rb rc bl"><strong class="or jc">Prototyping &amp; demoing features without the server is hard.</strong> Typically, server and client engineers agree on a GraphQL schema early on in the feature development process. Once the schema has been established, however, the two groups split off and start working in parallel: Server engineers implement the logic to back the new schema and client engineers build the frontend UI, logic, and the queries that power them. This parallelization is particularly challenging for client engineers, since they can’t actually test the UI they’re building until the server has fully implemented the schema. To unblock themselves, client engineers often hardcode data into views, leverage proxies to manipulate responses, or hack custom logic into the networking layer locally, resulting in wasted time and effort.</li><li id="8e4a" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj ra rb rc bl"><strong class="or jc">Mocks get out of sync with GraphQL queries over time.</strong> Since most mocks are hand-written, they are not tightly coupled to the underlying queries and schema they are supposed to represent. If a team builds a new feature, then comes back a few months later to add new functionality backed by additional GraphQL fields, engineers must remember to manually update their mock data. 
As there is no forcing function to guarantee mocks stay in sync with queries, mock data tends to shift further away from the production reality as time passes — degrading the quality of tests.</li></ol><p id="2e68" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">These challenges are not unique to Airbnb and are common across the industry. Although tooling like random value generators and local field resolvers can provide some assistance, they lack the domain knowledge and context needed to produce realistic, meaningful data for high-quality demos, quick product iteration, and reliable testing.</p><h2 id="f20e" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">Goals</h2><p id="3ba3" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">When setting out to solve these challenges at Airbnb, we established three north-star goals:</p><ol class=""><li id="6805" class="op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj ra rb rc bl"><strong class="or jc">Eliminate the need to hand-write mock data.</strong> Mock data should be generated automatically to free up engineers from needing to hand-craft and maintain mock GraphQL data.</li><li id="bafa" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj ra rb rc bl"><strong class="or jc">Create highly realistic mock data.</strong> Mock data should match the user interface designs and look like real production data in order to support high-quality demos, which are highly valued at Airbnb for early feedback.</li><li id="bf4a" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj ra rb rc bl"><strong class="or jc">Keep engineers in their local focus loops.</strong> Our solution should seamlessly integrate into engineers’ current development processes so they can generate mocks without 
context-switching to a website, separate repository, or unfamiliar tool.</li></ol><h2 id="57fa" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">@generateMock: Schema + context + LLMs = magic</h2><p id="3542" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">To generate mock data while keeping engineers in their local focus loops, we introduced a new client GraphQL directive called @generateMock, which engineers can use to automatically generate mock data for a given GraphQL operation, fragment, or field:</p><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm ri"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*llh_79Y2eUguxztCsbzIkg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*llh_79Y2eUguxztCsbzIkg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*llh_79Y2eUguxztCsbzIkg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*llh_79Y2eUguxztCsbzIkg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*llh_79Y2eUguxztCsbzIkg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*llh_79Y2eUguxztCsbzIkg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*llh_79Y2eUguxztCsbzIkg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*llh_79Y2eUguxztCsbzIkg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Example of @generateMock being specified on a GraphQL query.</figcaption></figure><p id="133b" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">This directive accepts a few optional arguments that engineers can use to customize the generated mock data, and the directive itself can be repeated with different input arguments to generate different mock variations:</p><ul class=""><li id="e811" class="op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj rn rb rc bl"><em class="pk">id</em>: The identifier to use for the mock, as well as for naming generated helper functions. 
Useful when repeating the @generateMock directive to produce multiple mocks.</li><li id="0d83" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl"><em class="pk">hints</em>: Additional context or instructions on how the mock should look. For example, a hint might be “Include travel entries for Barcelona, Paris, and Kyoto.” Under the hood, this information is fed to an LLM and heavily influences what the generated mock data looks like and how densely populated its fields are.</li><li id="56da" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl"><em class="pk">designURL</em>: The URL of a design mockup of the screen that will render the mock data. Specifying this argument helps the LLM produce mock data that matches the design by generating matching names, addresses, and other similar content.</li></ul><p id="5dee" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">At Airbnb, engineers use a command-line tool we call Niobe to generate code for their GraphQL queries and fragments. After modifying a .graphql file locally, engineers run this code generator, then use the generated TypeScript/Kotlin/Swift files to send GraphQL requests. To generate mock data using @generateMock, engineers simply need to run Niobe code generation after adding the directive, just as they would after making any other GraphQL change.</p><p id="64cf" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">During code generation, Niobe produces both a JSON file containing the actual mock data for each @generateMock directive and a source file that provides functions for loading and consuming mock data in demo apps, snapshot tests, and unit tests. 
As shown in the Swift code below, the <em class="pk">mockMixedStatusIndicators()</em> function is generated on the InboxSyncQuery’s root <em class="pk">Data</em> type. It provides access to an instantiated type that’s populated with the generated mock data for <em class="pk">mixed_status_indicators</em>, allowing engineers to use the mock without having to load the JSON data manually:</p><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm ro"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*6stJEGGeLtkfKK_D6ulaoA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*6stJEGGeLtkfKK_D6ulaoA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*6stJEGGeLtkfKK_D6ulaoA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*6stJEGGeLtkfKK_D6ulaoA.png 
750w, https://miro.medium.com/v2/resize:fit:786/1*6stJEGGeLtkfKK_D6ulaoA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*6stJEGGeLtkfKK_D6ulaoA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*6stJEGGeLtkfKK_D6ulaoA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*6stJEGGeLtkfKK_D6ulaoA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Using a generated mock in a Swift unit test.</figcaption></figure><p id="34b1" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">Engineers are free to modify the generated mock JSON data as well — as we’ll see below, Niobe will avoid overwriting their modifications on subsequent generation invocations.</p><h2 id="913c" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">What does mock data look like?</h2><p id="8059" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">The context that we provide to the LLM is vital to generating data that is realistic enough to use in demos. 
To this end, Niobe collects the following information and includes it in the context passed to the LLM:</p><ul class=""><li id="da32" class="op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj rn rb rc bl">The definitions of the query/fragment/fields being mocked (i.e., those marked with @generateMock and their dependencies).</li><li id="e33f" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">The <em class="pk">subset</em> of the GraphQL schema being queried, as well as any associated documentation that is present as inline comments. This information enables the LLM to infer the types that are used by the query being mocked. Importantly, this isn’t the <em class="pk">whole</em> schema, because including the full schema would likely overload the context window — Niobe traverses the schema and strips out types and fields that are not needed to resolve the query, along with any extra whitespace.</li><li id="3c6e" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">The URL for the image representation of the design document specified within <em class="pk">designURL</em>, if any. Niobe integrates with an internal API to generate a snapshot image of the provided node in the design document. 
The API pushes this snapshot to a storage bucket and provides a URL that Niobe feeds to the LLM, along with specialized instructions on how to use it.</li><li id="381b" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">The additional <em class="pk">hints</em> specified in @generateMock.</li><li id="f8bb" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">The platform (e.g., “iOS”, “Android”, or “Web”) for which the mock data is being generated (for style specificity).</li><li id="17da" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">A list of Airbnb-hosted image URLs that the LLM can choose from if needed, along with short textual descriptions of each. This prevents the LLM from hallucinating image URLs that don’t exist and ensures that the mock data contains valid URLs which can be properly loaded at runtime when prototyping or demoing.</li></ul><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm rp"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*yPJB6saOgVZ7kQErYjGtWQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*yPJB6saOgVZ7kQErYjGtWQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*yPJB6saOgVZ7kQErYjGtWQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*yPJB6saOgVZ7kQErYjGtWQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*yPJB6saOgVZ7kQErYjGtWQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*yPJB6saOgVZ7kQErYjGtWQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*yPJB6saOgVZ7kQErYjGtWQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*yPJB6saOgVZ7kQErYjGtWQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Illustration of the various pieces of context that are passed to the LLM during mock generation.</figcaption></figure><p id="dae2" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">All this information is consolidated into a prompt we fine-tuned against Gemini 2.5 Pro. 
We chose this model because of its 1-million token context window, plus the fact that in our internal tests this configuration performed significantly faster than comparable models while producing mock data of similar quality. Using this approach, we’re able to produce highly realistic JSON mocks which, when loaded into the application, yield very convincing results as shown below:</p><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm rq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*1PXeYCxy9BVSWYtOI7edHQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*1PXeYCxy9BVSWYtOI7edHQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*1PXeYCxy9BVSWYtOI7edHQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*1PXeYCxy9BVSWYtOI7edHQ.png 750w, 
https://miro.medium.com/v2/resize:fit:786/1*1PXeYCxy9BVSWYtOI7edHQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*1PXeYCxy9BVSWYtOI7edHQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*1PXeYCxy9BVSWYtOI7edHQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*1PXeYCxy9BVSWYtOI7edHQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Screenshot of a design mockup compared to a mock that was generated using @generateMock.</figcaption></figure><p id="51bf" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">The data in the screenshot on the right looks quite realistic, but if you look closely you may notice that the data is indeed mocked — all the photos are coming from the seed data set that we feed the LLM.</p><h2 id="066e" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">How it works</h2><p id="e458" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">When an engineer uses the Niobe CLI to generate code for their GraphQL files, Niobe automatically performs mock generation as the final step of this process, as shown in the flowchart below:</p><ul class=""><li id="e6ee" class="op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj rn rb rc bl">If the @generateMock directive includes a <em class="pk">designURL</em>, Niobe validates the URL to 
ensure it includes a <em class="pk">node-id</em>, then uses an internal API to produce an image snapshot of that particular node. The API, in turn, pushes this snapshot to a storage bucket and provides Niobe with its URL.</li><li id="9e25" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">Next, the CLI aggregates all the context described in the section above — including the URL of the design snapshot — and crafts a prompt to send to the LLM. This prompt is then sent to the Gemini 2.5 Pro model, and results are streamed back to the client in order to show a progress indicator in the CLI.</li><li id="52dc" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">Once the mock JSON response has been received from the LLM, Niobe performs a validation step against this data by passing the GraphQL schema, client GraphQL document, and JSON data to the graphql <a class="ah ib" href="https://www.npmjs.com/package/graphql" rel="noopener ugc nofollow" target="_blank">NPM package</a>’s <em class="pk">graphqlSync</em> function.</li><li id="a5ed" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">If the validation produces errors (for example, if the LLM hallucinated an invalid enum value or failed to populate a required field), Niobe aggregates these errors and feeds them back into the LLM along with the initial mock data. This retry mechanism is used to essentially “self-heal” and fix invalid mock data.<br />– This step is <em class="pk">critical</em> to reliably generating mock data. 
By placing the LLM within our existing GraphQL infrastructure, we’re able to enforce a set of guardrails through this validation step and provide strong guarantees that the mock data produced at the end of the pipeline is fully valid, something that wouldn’t be possible with a tool outside our GraphQL infrastructure, such as ChatGPT.</li><li class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj rn rb rc bl">Finally, once the mock data has been validated, Niobe writes it to a JSON file, alongside a companion source file which provides functions for loading the mock from application code.</li></ul><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm rr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*i1gTnZfrP0qZmQrn3N-jkQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*i1gTnZfrP0qZmQrn3N-jkQ.png 640w, 
https://miro.medium.com/v2/resize:fit:720/1*i1gTnZfrP0qZmQrn3N-jkQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*i1gTnZfrP0qZmQrn3N-jkQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*i1gTnZfrP0qZmQrn3N-jkQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*i1gTnZfrP0qZmQrn3N-jkQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*i1gTnZfrP0qZmQrn3N-jkQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*i1gTnZfrP0qZmQrn3N-jkQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Flowchart of how mock generation works under the hood.</figcaption></figure><h2 id="ff6b" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">@respondWithMock: Unblocking client development</h2><p id="ade7" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">In addition to generating realistic mock data with @generateMock, we also wanted to empower client engineers to iterate on features without waiting for the backend server implementation. 
A second directive, @respondWithMock, works alongside @generateMock to make this possible:</p><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm ri"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*GDhohfDu-AgMKgN3yBH4LA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*GDhohfDu-AgMKgN3yBH4LA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*GDhohfDu-AgMKgN3yBH4LA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*GDhohfDu-AgMKgN3yBH4LA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*GDhohfDu-AgMKgN3yBH4LA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*GDhohfDu-AgMKgN3yBH4LA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*GDhohfDu-AgMKgN3yBH4LA.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*GDhohfDu-AgMKgN3yBH4LA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Simple example of using the @respondWithMock directive.</figcaption></figure><p id="4429" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">When this directive is present, Niobe alters the code that’s generated alongside the mock data to include extra details about this annotation. At runtime, the GraphQL client uses this to load the generated mock data, then seamlessly returns the mocked response <em class="pk">instead</em> of using data from the server. This effectively allows client engineers to unblock themselves from waiting on the server implementation, since they can easily use locally mocked data when querying unimplemented fields. The screenshot of the inbox screen earlier in this post is actually a real screenshot that was taken by generating with these two directives and running the Airbnb app in an iOS simulator — no manual mocking, proxying, or response modification needed!</p><p id="fe6e" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">@respondWithMock can also be specified on <em class="pk">individual fields</em>. 
When used on fields within a query instead of on the query itself, the GraphQL client will actually request all fields from the server <em class="pk">except those annotated with @respondWithMock</em>, then patch in locally mocked data for the annotated fields. The result is a hybrid of production and mock data, which makes it possible for client engineers to develop against new (unimplemented) fields in existing queries. Engineers can even repeat this directive and use query input variables to decide if and when to return a specific generated mock at runtime, as shown below:</p><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm rs"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*6JRec3pvDGoc8_nMYS7HIg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*6JRec3pvDGoc8_nMYS7HIg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*6JRec3pvDGoc8_nMYS7HIg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*6JRec3pvDGoc8_nMYS7HIg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*6JRec3pvDGoc8_nMYS7HIg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*6JRec3pvDGoc8_nMYS7HIg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*6JRec3pvDGoc8_nMYS7HIg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*6JRec3pvDGoc8_nMYS7HIg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Using @respondWithMock with conditionals and on individual fields.</figcaption></figure><h2 id="6ad4" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">Schema evolution: Keeping mocks truthful</h2><p id="271a" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">The final challenge we addressed was the issue of keeping mocks in sync with queries as they evolve over time. Since Niobe manages mock data that is generated via the @generateMock directive, it can be smart about maintaining that mock data. 
As part of mock generation, Niobe embeds two extra keys in each generated JSON file:</p><ol class=""><li id="5633" class="op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj ra rb rc bl">A hash of the client entity being mocked (i.e., the GraphQL query document).</li><li id="e526" class="op oq jb or b os rd ou ov ow re oy oz hb rf pb pc he rg pe pf hh rh ph pi pj ra rb rc bl">A hash of the input arguments to @generateMock.</li></ol><figure class="po pp pq pr ps pt pl pm paragraph-image"><div role="button" tabindex="0" class="pu pv fr pw bi px"><div class="pl pm rt"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*P0Fps2QTSpsnTP_shlqNfQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*P0Fps2QTSpsnTP_shlqNfQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*P0Fps2QTSpsnTP_shlqNfQ.png 720w, 
https://miro.medium.com/v2/resize:fit:750/1*P0Fps2QTSpsnTP_shlqNfQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*P0Fps2QTSpsnTP_shlqNfQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*P0Fps2QTSpsnTP_shlqNfQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*P0Fps2QTSpsnTP_shlqNfQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*P0Fps2QTSpsnTP_shlqNfQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rj fm rk pl pm rl rm bg b bh ab eb">Niobe embeds version hashes in mock data in order to determine when a given mock needs to be updated.</figcaption></figure><p id="499b" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">Each time code generation runs, Niobe determines whether existing mocks’ hashes differ from what their current hashes should be based on the GraphQL document. If they match, it skips mock generation for those types. 
On the other hand, if one of the hashes changed, Niobe intelligently updates that mock by including the existing mock in the context provided to the LLM, along with instructions on how to modify it.</p><p id="c8f3" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">It’s important that Niobe doesn’t unnecessarily modify existing mock data for fields that are unchanged and still valid, since doing so could overwrite manual tweaks that were made to the JSON by engineers or break existing tests that rely on this data. To avoid this, we provide the LLM with a diff of what changed in the query, and tuned the prompt to focus on that diff and avoid making spurious changes to unrelated fields.</p><p id="090b" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">Finally, each client codebase includes an automated check that ensures mock version hashes are up to date when code is submitted. This provides a guarantee that all generated mocks stay in sync with queries as they evolve over time. 
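</p><p class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">The hash comparison behind this check can be sketched roughly as follows. This is a minimal illustration in Python, not Niobe’s actual implementation: the key names <em class="pk">documentHash</em> and <em class="pk">argumentsHash</em>, the use of SHA-256, and all function names are assumptions made for the example.</p>
<pre>
```python
import hashlib
import json


def stable_hash(value):
    """Hash a JSON-serializable value deterministically (SHA-256 is an assumption)."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stamp_mock(data, query_document, directive_args):
    """Attach both version hashes to freshly generated mock data."""
    # Hypothetical key names; the post does not say what Niobe calls them.
    return {
        "documentHash": stable_hash(query_document),
        "argumentsHash": stable_hash(directive_args),
        "data": data,
    }


def mock_is_stale(mock, query_document, directive_args):
    """Return True when either embedded hash no longer matches the current inputs."""
    return (
        mock.get("documentHash") != stable_hash(query_document)
        or mock.get("argumentsHash") != stable_hash(directive_args)
    )


# Hypothetical query text and directive arguments, for illustration only.
query = "query Inbox { inbox { id title } }"
args = {"designURL": "https://example.com/design?node-id=1-2"}
mock = stamp_mock({"inbox": []}, query, args)

fresh = mock_is_stale(mock, query, args)                  # hashes match: skip regeneration
changed = mock_is_stale(mock, query + " # edited", args)  # query changed: regenerate
```
</pre>
<p class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">In this sketch any textual change to the query marks the mock as stale; a real tool would presumably normalize the document before hashing so that formatting-only edits do not trigger regeneration.</p><p class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">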
When engineers encounter these validation failures, they just re-run code generation locally — no manual updates required.</p><h2 id="4eba" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">Conclusion</h2><p id="9c98" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl"><em class="pk">“@generateMock has significantly sped up my local development and made working with local data much more enjoyable.” — Senior Software Engineer</em></p><p id="1b9a" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">By integrating highly contextualized LLMs — informed by the GraphQL schema, product context, and UX designs — directly into existing GraphQL tooling, we’ve unlocked the ability to generate valid and realistic mock data while eliminating the need for engineers to manually hand-write and maintain mocks. The directive-driven approach of @generateMock and @respondWithMock allows engineers to build clients before the server implementation is complete while keeping them in their focus loops and providing a guarantee that mock data stays in sync as queries evolve.</p><p id="a7f4" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">In just the past few months, Airbnb engineers have generated and merged over 700 mocks across iOS, Android, and Web using @generateMock, and we plan to roll out internal support for backend services soon. 
These tools have fundamentally changed how engineers mock GraphQL data for tests and prototypes at Airbnb, allowing them to focus on building product features rather than crafting and maintaining mock data.</p><h2 id="e3ec" class="pz qa jb bg qb qc qd qe gy qf qg qh ha qi qj qk ql qm qn qo qp qq qr qs qt qu bl">Acknowledgments</h2><p id="c39d" class="pw-post-body-paragraph op oq jb or b os qv ou ov ow qw oy oz hb qx pb pc he qy pe pf hh qz ph pi pj id bl">Special thanks to Raymond Wang and Virgil King for their contributions bringing @generateMock support to Web and Android clients, as well as to many other engineers and teams at Airbnb who participated in design reviews, built supporting infrastructure, and provided usage feedback.</p><p id="2a43" class="pw-post-body-paragraph op oq jb or b os ot ou ov ow ox oy oz hb pa pb pc he pd pe pf hh pg ph pi pj id bl">Does this type of work interest you? Check out our open roles <a class="ah ib" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/graphql-data-mocking-at-scale-with-llms-and-generatemock-30b380f12bd6</link>
      <guid>https://medium.com/airbnb-engineering/graphql-data-mocking-at-scale-with-llms-and-generatemock-30b380f12bd6</guid>
      <pubDate>Thu, 30 Oct 2025 18:01:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[From Static Rate Limiting to Adaptive Traffic Management in Airbnb’s Key-Value Store]]></title>
      <description><![CDATA[<div><div></div><p id="c4c3" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">How Airbnb hardened Mussel, our key-value store, with smarter traffic controls to stay fast and reliable during traffic spikes.</p><figure class="pg ph pi pj pk pl pd pe paragraph-image"><div role="button" tabindex="0" class="pm pn fr po bi pp"><div class="pd pe pf"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 
786w, https://miro.medium.com/v2/resize:fit:828/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*kBx_8QLd7El4TZ2nrouYUw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="9e45" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">By <a class="ah ib" href="https://www.linkedin.com/in/shravangaonkar/" rel="noopener ugc nofollow" target="_blank">Shravan Gaonkar</a>, <a class="ah ib" href="https://www.linkedin.com/in/caseygetz/" rel="noopener ugc nofollow" target="_blank">Casey Getz</a>, <a class="ah ib" href="https://www.linkedin.com/in/wonheec/" rel="noopener ugc nofollow" target="_blank">Wonhee Cho</a></p><h2 id="497e" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Introduction</h2><p id="6cce" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">Every request lookup on Airbnb, from stays, experiences, and services search to customer support inquiries, ultimately hits <a class="ah ib" rel="noopener" href="https://medium.com/airbnb-engineering/mussel-airbnbs-key-value-store-for-derived-data-406b9fa1b296" data-discover="true">Mussel</a>, our multi-tenant key-value store for derived data. Mussel operates as a proxy service, deployed as a fleet of stateless dispatchers, each a Kubernetes pod. 
On a typical day, this fleet handles millions of predictable point and range reads. During peak events, however, it must absorb several-fold higher volume, terabyte-scale bulk uploads, and sudden bursts from automated bots or DDoS attacks. Its ability to reliably serve this volatile mix of traffic is therefore critical to both the Airbnb user experience and the stability of the many services that power our platform.</p><p id="047c" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Given Mussel’s traffic volume and its role in core Airbnb flows, <a class="ah ib" href="https://en.wikipedia.org/wiki/Quality_of_service" rel="noopener ugc nofollow" target="_blank">quality of service</a> (QoS) is one of the product’s defining features. The first-generation QoS system was primarily an isolation tool. It relied on a Redis-backed, client-quota-based rate limiter that checked each caller’s queries per second (QPS) against a fixed, configurable quota. The goal was to prevent a single misbehaving client from overwhelming the service and causing a complete outage. For this purpose, it was simple and effective.</p><p id="5968" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">However, as the service matured, our goal shifted from merely preventing meltdowns to maximizing goodput (that is, getting the most useful work done without degrading performance). A system of fixed, manually configured quotas can’t achieve this, as it can’t adapt in real time to shifting traffic patterns, new query shapes, or sudden threats like a DDoS attack. 
A truly effective QoS system needs to be adaptive, automatically exerting prioritized backpressure when it senses the system has reached its useful capacity.</p><p id="b3fc" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">To better match our QoS system to the realities of online traffic and maximize goodput, we evolved it over time by adding several new layers:</p><ul class=""><li id="d805" class="oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc qs qt qu bl"><strong class="ok jc">Resource-aware rate control (RARC)</strong>: Charges each request in <em class="qv">request units</em> (RU) that reflect rows, bytes, and latency, not just counts.</li><li id="d39c" class="oi oj jb ok b ol qw on oo op qx or os hb qy ou ov he qz ox oy hh ra pa pb pc qs qt qu bl"><strong class="ok jc">Load shedding with criticality tiers</strong>: Guarantees that high-priority traffic (e.g., customer support, trust and safety) stays responsive when capacity evaporates.</li><li id="5a8f" class="oi oj jb ok b ol qw on oo op qx or os hb qy ou ov he qz ox oy hh ra pa pb pc qs qt qu bl"><strong class="ok jc">Hot-key detection &amp; DDoS mitigation</strong>: Detects skewed access patterns in real time and then shields the backend, whether the surge is legitimate or a DDoS burst, by caching or coalescing the duplicate requests before they reach the storage layer.</li></ul><p id="831f" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">What follows is an engineer’s view of how these layers were designed, deployed, and battle-tested, and why the same ideas may apply to any multi-tenant system that has outgrown simple QPS limits.</p><figure class="pg ph pi pj pk pl pd pe paragraph-image"><div role="button" tabindex="0" class="pm pn fr po bi pp"><div class="pd pe rb"><picture><source 
srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*XVDMQb8i2pQEUogiZipFbQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 
2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rc fm rd pd pe re rf bg b bh ab eb">Progression Timeline</figcaption></figure><h2 id="fe4c" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Background: Life with Client Quota Rate Limiter</h2><p id="c159" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">When Mussel launched, rate limiting was handled entirely by a simple QPS limiter built on a Redis-based distributed counter service. Each caller received a static, per-minute quota, and the dispatcher incremented a Redis key for every incoming request. If the key’s value exceeded the caller’s quota, the dispatcher returned an HTTP 429. The design was simple, predictable, and easy to operate.</p><p id="6614" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Two architectural details made this feasible. First, Mussel and its storage engine were tightly coupled; backend effort correlated reasonably well with the number of calls at the front door. Second, the traffic mix was modest in size and variety, so a single global limit per caller rarely caused trouble.</p><p id="1d8c" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">As adoption grew, two limitations became clear.</p><ol class=""><li id="6101" class="oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc rg qt qu bl"><strong class="ok jc">Cost variance:</strong> A one-row lookup and a 100,000-row scan were treated equally, even though their load on the backend differed by orders of magnitude. 
The system couldn’t distinguish high-value cheap work from low-value expensive work.</li><li id="3712" class="oi oj jb ok b ol qw on oo op qx or os hb qy ou ov he qz ox oy hh ra pa pb pc rg qt qu bl"><strong class="ok jc">Traffic skew:</strong> Per-caller rate limits provided isolation at the client level, but were blind to the data’s access pattern. When a single key became “hot” — for example, a popular listing accessed by thousands of different callers simultaneously — the aggregate traffic could overwhelm the underlying storage shard, even if each individual caller remained within its quota. This created a localized bottleneck that degraded performance for the entire cluster, impacting clients requesting completely unrelated data. Isolation by <em class="qv">caller</em> was insufficient to prevent this kind of resource contention.</li></ol><p id="d91b" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Addressing these gaps meant shifting from a <em class="qv">request-counting</em> mindset to a <em class="qv">resource-accounting</em> mindset and designing controls that reflect the real cost of each operation.</p><h2 id="058b" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Resource-aware rate control</h2><p id="9451" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">A fair quota system must account for the real work a request imposes on the storage layer. Resource-aware rate control (RARC) meets this need by charging operations in <em class="qv">request units</em> (RU) rather than raw requests per second.</p><p id="7050" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">A request unit blends four observable factors: fixed per-call overhead, rows processed, payload bytes, and — crucially — latency. 
Latency captures effects that rows and bytes alone miss: two one-megabyte reads can differ greatly in cost if one hits the cache and the other goes to disk. In practice, we use a linear model. For reads and writes, the costs are:</p><pre class="pg ph pi pj pk rh ri rj bq rk bc bl"><br />RU_read = 1 + w_r × bytes_read + w_l × latency_ms<br />RU_write = 6 + w_b × bytes_written / 4096 bytes + w_l × latency_ms<br /><br />Weight factors w_r, w_b, and w_l come from load-test calibration <br />based on compute, network, and disk I/O. <br />bytes_read, bytes_written, and latency_ms are measured per request.</pre><p id="f95e" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Although approximate, the formula separates operations whose surface metrics look similar yet load the backend very differently.</p><figure class="pg ph pi pj pk pl pd pe paragraph-image"><div role="button" tabindex="0" class="pm pn fr po bi pp"><div class="pd pe rq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*VuIeZSMzyRqXuXpfSzM_yg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rc fm rd pd pe re rf bg b bh ab eb">Impact of Latency on RU computation</figcaption></figure><p id="4d5b" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Each dispatcher continues to rely on the rate limiter for distributed counting, but the counter now represents request-unit tokens instead of raw request counts. At the start of every epoch, the dispatcher adds the caller’s static RU quota to a local token bucket and then debits that bucket by the RU cost of each incoming request. When the bucket is empty, the request is rejected with HTTP 429. 
Because all dispatchers follow the same procedure and epochs are short, their buckets remain closely aligned without additional coordination.</p><p id="dd56" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Adaptive protection is handled in the separate load-shedding layer; backend latency influences which traffic is dropped or delayed, not the size of the periodic RU refill. This keeps rate accounting straightforward (static quotas expressed in request units) while still reacting quickly when the storage layer shows signs of stress.</p><h2 id="9eb6" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Load shedding: Staying healthy when capacity evaporates or develops hotspots</h2><p id="0009" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">Rate limits based on request units excel at smoothing normal traffic, but they adjust on a scale of seconds. When the workload shifts faster (a bot floods a key, a shard stalls, or a batch job begins a full-table scan), those seconds are enough for queues to balloon and service-level objectives to slip. To bridge this reaction-time gap, Mussel uses a load-shedding safety net that combines three real-time signals: (1) traffic criticality, (2) a latency ratio, and (3) a CoDel-inspired queueing policy.</p><p id="6c87" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">The latency ratio serves as a real-time indicator of system stress. Each dispatcher computes it by dividing the long-term p95 latency by the short-term p95 latency. A stable system has a ratio near 1.0; a value dropping toward 0.3 indicates that latency is rising sharply. 
When that threshold is crossed, the dispatcher temporarily increases the RU cost applied to a designated client class so that its token bucket drains faster and the request rate naturally backs off. If the ratio keeps falling, the same penalty can be expanded to additional classes until latency returns to a safe range.</p><p id="a04b" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">The p95 estimates use the constant-memory P² algorithm [1], which requires no raw sample storage or cross-node coordination.</p><figure class="pg ph pi pj pk pl pd pe paragraph-image"><div role="button" tabindex="0" class="pm pn fr po bi pp"><div class="pd pe rr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*GWv-z2hXM6RiW6sRJKzYCg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*GWv-z2hXM6RiW6sRJKzYCg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*GWv-z2hXM6RiW6sRJKzYCg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*GWv-z2hXM6RiW6sRJKzYCg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*GWv-z2hXM6RiW6sRJKzYCg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*GWv-z2hXM6RiW6sRJKzYCg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*GWv-z2hXM6RiW6sRJKzYCg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*GWv-z2hXM6RiW6sRJKzYCg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rc fm rd pd pe re rf bg b bh ab eb">Latency response over time and illustration of throttling</figcaption></figure><p id="31e2" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">The Control-Delay (CoDel) thread pool tackles the second hazard: queue buildup <em class="qv">inside the dispatcher itself</em> [3]. It monitors the time a request <em class="qv">waits</em> in the queue. If that sojourn time shows the system is already saturated, the request fails early, freeing up memory and threads for higher-priority work. 
An optional latency penalty can also be applied to RU accounting, charging more for queries from callers that persistently trip the latency-ratio threshold.</p><p id="9012" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Together, these layers (criticality, a real-time latency ratio, and adaptive queueing) form a shield that lets guest-facing traffic ride out backend hiccups. In practice, this system has cut recovery times by about half and keeps dispatchers stable without human intervention.</p><h2 id="ba6a" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Hot-key detection and DDoS defence</h2><p id="c9a2" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">Request-unit limits and load shedding keep client usage fair, but they cannot stop a stampede of identical reads aimed at one record. Imagine a listing that hits the front page of a major news outlet: tens of thousands of guests refresh their browsers, all asking for the same key. A misconfigured crawler or a deliberate botnet can generate the same access pattern, only faster. The result is shard overload, a full dispatcher queue, and rising latency for unrelated work.</p><p id="23b1" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Mussel neutralises this amplification with a three-step hot-key defence layer<strong class="ok jc">:</strong> real-time detection, local caching, and request coalescing.</p><h2 id="666e" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Real-time detection in constant space</h2><p id="fe3d" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">Every dispatcher streams incoming keys into an in-memory <em class="qv">top-k</em> counter. 
The counter is a variant of the Space-Saving algorithm [2] popularized in Brian Hayes’s “Britney Spears Problem” essay [4]. In just a few megabytes, it tracks approximate hit counts, maintains a frequency-ordered heap, and surfaces the hottest keys in real time within each dispatcher.</p><h2 id="5c9b" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Local caching and request coalescing</h2><p id="ce9d" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">When a key crosses the hot threshold, the dispatcher serves it from a process-local LRU cache. Entries expire after roughly three seconds, so they vanish as soon as demand cools; no global cache is required. Misses for the same key can still arrive several times within the same millisecond, so the dispatcher tracks in-flight reads for hot keys. New arrivals attach to the pending future; the first backend response then fans out to all waiters. In most cases only <em class="qv">one</em> request per hot key per dispatcher pod ever reaches the storage layer.</p><h2 id="d0fc" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Impact in production</h2><p id="6ebc" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">In a controlled DDoS drill that targeted a small set of keys at roughly million-QPS scale, the hot-key layer collapsed the burst to a trickle: each dispatcher forwarded only an occasional request, well below the capacity of any individual shard, so the backend never felt the surge.</p><figure class="pg ph pi pj pk pl pd pe paragraph-image"><div role="button" tabindex="0" class="pm pn fr po bi pp"><div class="pd pe rs"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 640w, 
https://miro.medium.com/v2/resize:fit:720/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*DuwP-cqJEnQcbD64RGUTYA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*DuwP-cqJEnQcbD64RGUTYA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*DuwP-cqJEnQcbD64RGUTYA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*DuwP-cqJEnQcbD64RGUTYA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*DuwP-cqJEnQcbD64RGUTYA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*DuwP-cqJEnQcbD64RGUTYA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*DuwP-cqJEnQcbD64RGUTYA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*DuwP-cqJEnQcbD64RGUTYA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, 
(min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rc fm rd pd pe re rf bg b bh ab eb">Hot keys detected and served from dispatcher cache in real time</figcaption></figure><h2 id="91e1" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">Retrospective and key takeaways</h2><p id="1f7c" class="pw-post-body-paragraph oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc id bl">The journey from a single QPS counter to a layered, cost-aware QoS stack has reshaped how Mussel handles traffic and, just as importantly, how engineers think about fairness and resilience. A few themes surface when we look back across the stages described above.</p><p id="1000" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">The first is the value of early, visible impact. The initial release of request-unit accounting went live well before load shedding or hot-key defence. Soon after deployment it automatically throttled a caller whose range scans had been quietly inflating cluster latency. That early win validated the concept and built momentum for the deeper changes that followed.</p><p id="4c00" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">A second lesson is to keep control loops local. All the key signals (P² latency quantiles, the Space-Saving top-k counter, and CoDel queue delay) run entirely inside each dispatcher. 
Because no cross-node coordination is required, the system scales linearly and continues to protect capacity even if the control plane is itself under stress.</p><p id="5844" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Third, effective protection works on two different time-scales<strong class="ok jc">.</strong> Per-call RU pricing catches micro-spikes; the latency ratio and CoDel queue thresholds respond to macro slow-downs. Neither mechanism alone would have kept latency flat during the last controlled DDoS drill, but in concert they absorbed the shock and recovered within seconds.</p><p id="631b" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Finally, QoS is a living system. Traffic patterns evolve, back-end capabilities improve, and new workloads appear. Planned next steps include database-native resource groups and automatic quota tuning from thirty-day usage curves. The principles that guided this project (measure true cost, react locally and quickly, layer defences) form a durable template, but the implementation will continue to grow with the platform it protects.</p><p id="b2c3" class="pw-post-body-paragraph oi oj jb ok b ol om on oo op oq or os hb ot ou ov he ow ox oy hh oz pa pb pc id bl">Does this type of work interest you? We’re hiring; check out open roles <a class="ah ib" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p><h2 id="54c4" class="pr ps jb bg pt pu pv pw gy px py pz ha qa qb qc qd qe qf qg qh qi qj qk ql qm bl">References</h2><ol class=""><li id="bfd5" class="oi oj jb ok b ol qn on oo op qo or os hb qp ou ov he qq ox oy hh qr pa pb pc rg qt qu bl">Raj Jain and Imrich Chlamtac. 1985. The P² algorithm for dynamic calculation of quantiles and histograms without storing observations. 
<em class="qv">Communications of the ACM</em>, <strong class="ok jc">28</strong>(10), 1076–1085.<a class="ah ib" href="https://doi.org/10.1145/4372.4378" rel="noopener ugc nofollow" target="_blank"> https://doi.org/10.1145/4372.4378</a></li><li id="57ac" class="oi oj jb ok b ol qw on oo op qx or os hb qy ou ov he qz ox oy hh ra pa pb pc rg qt qu bl">Erik D. Demaine, Alejandro López-Ortiz, and J. Ian Munro. 2002. Frequency estimation of internet packet streams with limited space. In <em class="qv">Algorithms — ESA 2002: 10th Annual European Symposium</em>, Rome, Italy, September 17–21, 2002. Rolf H. Möhring and Rajeev Raman (Eds.). <em class="qv">Lecture Notes in Computer Science</em>, Vol. <strong class="ok jc">2461</strong>. Springer, 348–360.</li><li id="ff4a" class="oi oj jb ok b ol qw on oo op qx or os hb qy ou ov he qz ox oy hh ra pa pb pc rg qt qu bl">Kathleen M. Nichols and Van Jacobson. 2012. Controlling queue delay. <em class="qv">Communications of the ACM</em>, <strong class="ok jc">55</strong>(7), 42–50.<a class="ah ib" href="https://doi.org/10.1145/2209249.2209264" rel="noopener ugc nofollow" target="_blank"> https://doi.org/10.1145/2209249.2209264</a></li><li id="b450" class="oi oj jb ok b ol qw on oo op qx or os hb qy ou ov he qz ox oy hh ra pa pb pc rg qt qu bl">Brian Hayes. 2008. Computing science: The Britney Spears problem. <em class="qv">American Scientist</em>, <strong class="ok jc">96</strong>(4), 274–279. <a class="ah ib" href="https://www.americanscientist.org/article/the-britney-spears-problem" rel="noopener ugc nofollow" target="_blank">https://www.americanscientist.org/article/the-britney-spears-problem</a></li></ol></div>]]></description>
      <link>https://medium.com/airbnb-engineering/from-static-rate-limiting-to-adaptive-traffic-management-in-airbnbs-key-value-store-29362764e5c2</link>
      <guid>https://medium.com/airbnb-engineering/from-static-rate-limiting-to-adaptive-traffic-management-in-airbnbs-key-value-store-29362764e5c2</guid>
      <pubDate>Thu, 09 Oct 2025 18:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Building a Next-Generation Key-Value Store at Airbnb]]></title>
      <description><![CDATA[<div><div></div><p id="755a" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">By <a class="ah hv" href="https://www.linkedin.com/in/shravangaonkar/" rel="noopener ugc nofollow" target="_blank">Shravan Gaonkar</a>, <a class="ah hv" href="https://www.linkedin.com/in/chandramoulir/" rel="noopener ugc nofollow" target="_blank">Chandramouli Rangarajan</a>, <a class="ah hv" href="https://www.linkedin.com/in/yanhan-zhang/" rel="noopener ugc nofollow" target="_blank">Yanhan Zhang</a></p><p id="cf74" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">How we completely rearchitected Mussel, our storage engine for derived data, and lessons learned from the migration from Mussel V1 to V2.</p><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fr pi bi pj"><div class="ox oy oz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*sUS0d9nGKa-WQupVS8Wb4g.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="52e6" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">Airbnb’s core key-value store, internally known as Mussel, bridges offline and online workloads, providing highly scalable bulk load capabilities combined with single-digit millisecond reads.</p><p id="ac4d" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">Since first writing about Mussel in a 2022 <a class="ah hv" rel="noopener" href="https://medium.com/airbnb-engineering/mussel-airbnbs-key-value-store-for-derived-data-406b9fa1b296" data-discover="true">blog post</a>, we have 
completely deprecated the storage backend of the original system (what we now call Mussel v1) and have replaced it with a NewSQL backend which we are referring to as Mussel v2. Mussel v2 has been running successfully in production for a year, and we wanted to share why we undertook this rearchitecture, what the challenges were, and what benefits we got from it.</p><h2 id="1e62" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Why rearchitect</h2><p id="1e80" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">Mussel v1 reliably supported Airbnb for years, but new requirements — real-time fraud checks, instant personalization, dynamic pricing, and massive data — demand a platform that combines real-time streaming with bulk ingestion, all while being easy to manage.</p><h2 id="a1db" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Key Challenges with v1</h2><p id="eb8e" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">Mussel v2 solves a number of issues with v1, delivering a scalable, cloud-native key-value store with predictable performance and minimal operational overhead.</p><ul class=""><li id="b81f" class="oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow qr qs qt bl"><strong class="oe iw">Operational complexity:</strong> Scaling or replacing nodes required multi-step Chef scripts on EC2; v2 uses Kubernetes manifests and automated rollouts, reducing hours of manual work to minutes.</li><li id="6630" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Capacity &amp; hotspots:</strong> Static hash partitioning sometimes overloaded nodes, leading to latency spikes. 
v2’s dynamic range sharding and presplitting keep reads fast (p99 &lt; 25ms), even for 100TB+ tables.</li><li id="ea1b" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Consistency flexibility:</strong> v1 offered limited consistency control. v2 lets teams choose between immediate and eventual consistency based on their SLA needs.</li><li id="18f7" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Cost &amp; transparency:</strong> Resource usage in v1 was opaque. v2 adds namespace tenancy, quota enforcement, and dashboards, providing cost visibility and control.</li></ul><h2 id="8a3c" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">New architecture</h2><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fr pi bi pj"><div class="ox oy qz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*yjXZcYHPpQdl-peEhfEAmA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*yjXZcYHPpQdl-peEhfEAmA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*yjXZcYHPpQdl-peEhfEAmA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*yjXZcYHPpQdl-peEhfEAmA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*yjXZcYHPpQdl-peEhfEAmA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*yjXZcYHPpQdl-peEhfEAmA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*yjXZcYHPpQdl-peEhfEAmA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*yjXZcYHPpQdl-peEhfEAmA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="49b4" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">Mussel v2 is a complete re-architecture addressing v1’s operational and scalability challenges. 
It’s designed to be automated, maintainable, and scalable, while ensuring feature parity and an easy migration for 100+ existing use cases.</p><h2 id="d4c1" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Dispatcher</h2><p id="4e15" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">In Mussel v2, the Dispatcher is a stateless, horizontally scalable Kubernetes service that replaces the tightly coupled, protocol-specific design of v1. It translates client API calls into backend queries/mutations, supports dual-write and shadow-read modes for migration, manages retries and rate limits, and integrates with Airbnb’s service mesh for security and service discovery.</p><p id="962e" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl"><strong class="oe iw">Reads</strong> are simplified: Each dataname maps to a logical table, enabling optimized point lookups, range/prefix queries, and stale reads from local replicas to reduce latency. Dynamic throttling and prioritization maintain performance under changing traffic.</p><p id="1bd3" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl"><strong class="oe iw">Writes</strong> are first persisted in Kafka for durability, with the Replayer and Write Dispatcher applying them in order to the backend. This event-driven model absorbs bursts, ensures consistency, and removes v1’s operational overhead. 
Kafka also underpins upgrades, bootstrapping, and migrations until CDC and snapshotting mature.</p><p id="6818" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">The architecture suits derived data and replay-heavy use cases today, with a long-term goal of shifting ingestion and replication fully to the distributed backend database to bring down latency and simplify operations.</p><p id="4301" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl"><strong class="oe iw">Bulk load<br /></strong>Bulk load remains essential for moving large datasets from offline warehouses into Mussel for low-latency queries. v2 preserves v1 behavior, supporting both “merge” (add to existing tables) and “replace” (swap datasets) semantics.</p><p id="0ed1" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">To maintain a familiar interface, v2 keeps the existing Airflow-based onboarding and transforms warehouse data into a standardized format, uploading it to S3 for ingestion. <a class="ah hv" href="https://airbnb.io/projects/airflow/" rel="noopener ugc nofollow" target="_blank">Airflow</a> is an open-source platform for authoring, scheduling, and monitoring data pipelines. Created at Airbnb, it lets users define workflows in code as directed acyclic graphs (DAGs), enabling quick iteration and easy orchestration of tasks for data engineers and scientists worldwide.</p><p id="c950" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">A stateless controller orchestrates jobs, while a distributed, stateful worker fleet (Kubernetes StatefulSets) performs parallel ingestion, loading records from S3 into tables. 
Optimizations — like deduplication for replace jobs, delta merges, and insert-on-duplicate-key-ignore — ensure high throughput and efficient writes at Airbnb scale.</p><h2 id="c13d" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">TTL</h2><p id="0fe5" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">Automated data expiration (TTL) can help support data governance goals and storage efficiency. In v1, expiration relied on the storage engine’s compaction cycle, which struggled at scale.</p><p id="f544" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">Mussel v2 introduces a topology-aware expiration service that shards data namespaces into range-based subtasks processed concurrently by multiple workers. Expired records are scanned and deleted in parallel, minimizing sweep time for large datasets. Subtasks are scheduled to limit impact on live queries, and write-heavy tables use max-version enforcement with targeted deletes to maintain performance and data hygiene.</p><p id="fd55" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">These enhancements provide the same retention functionality as v1 but with far greater efficiency, transparency, and scalability, meeting Airbnb’s modern data platform demands and enabling future use cases.</p><h2 id="3f4c" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">The migration process</h2><h2 id="2cf0" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Challenge</h2><p id="259f" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">Mussel stores vast amounts of data and serves thousands of tables across a wide array of Airbnb services, sustaining mission-critical read 
and write traffic at high scale. Given the criticality of Mussel to Airbnb’s online traffic, our migration goal was straightforward but challenging: Move all data and traffic from Mussel v1 to v2 with zero data loss and no impact on availability to our customers.</p><h2 id="094a" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Process</h2><p id="3093" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">We adopted a blue/green migration strategy, but with notable complexities. Mussel v1 didn’t provide table-level snapshots or CDC streams, which are standard in many datastores. To bridge this gap, we developed a custom migration pipeline capable of bootstrapping tables to v2, selected by usage patterns and risk profiles. Once bootstrapped, dual writes were enabled on a per-table basis to keep v2 in sync as the migration progressed.</p><p id="8bf9" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">The migration itself followed several distinct stages:</p><ul class=""><li id="5cd4" class="oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow qr qs qt bl"><strong class="oe iw">Blue Zone:</strong> All traffic initially flowed to v1 (“Blue”). This provided a stable baseline as we migrated data behind the scenes.</li><li id="99de" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Shadowing (Green):</strong> Once tables were bootstrapped, v2 (“Green”) began shadowing v1 — handling reads/writes in parallel, but only v1 responded. 
This allowed us to check v2’s correctness and performance without risk.</li><li id="e4e6" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Reverse:</strong> After building confidence, v2 took over active traffic while v1 remained on standby. We built automatic circuit breakers and fallback logic: If v2 showed elevated error rates or lagged behind v1, we could instantly return traffic to v1 or revert to shadowing.</li><li id="d6ac" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Cutover:</strong> When v2 passed all checks, we completed the cutover on a dataname-by-dataname basis, with Kafka serving as a robust intermediary for write reliability throughout.</li></ul><p id="9ef1" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">To further de-risk the process, migration was performed one table at a time. Every step was reversible and could be fine-tuned per table or group of tables based on their risk profile. 
This granular, staged approach allowed for rapid iteration, safe rollbacks, and continuous progress without impacting the business.</p><h2 id="22ad" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Migration pipeline</h2><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fr pi bi pj"><div class="ox oy ra"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 786w, 
https://miro.medium.com/v2/resize:fit:828/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*4Q-yjBQu8jwWSv0pkkNIHQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="2bbc" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">As described in our previous blog post, the v1 architecture uses Kafka as a replication log — data is first written to Kafka, then consumed by the v1 backend. 
During the data migration to v2, we leveraged the same Kafka stream to maintain eventual consistency between v1 and v2.</p><p id="6e17" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">To migrate any given table from v1 to v2, we built a custom pipeline consisting of the following steps:</p><ol class=""><li id="587b" class="oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow rb qs qt bl"><strong class="oe iw">Source data sampling</strong>: We download backup data from v1, extract the relevant tables, and sample the data to understand its distribution.</li><li id="21dc" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl"><strong class="oe iw">Create pre-split table on v2</strong>: Based on the sampling results, we create a corresponding v2 table with a pre-defined shard layout to minimize data reshuffling during migration.</li><li id="dd3b" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl"><strong class="oe iw">Bootstrap</strong>: This is the most time-consuming step, taking hours or even days depending on table size. 
To bootstrap efficiently, we use Kubernetes StatefulSets to persist local state and periodically checkpoint progress.</li><li id="c992" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl"><strong class="oe iw">Checksum verification</strong>: We verify that all data from the v1 backup has been correctly ingested into v2.</li><li id="1a0f" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl"><strong class="oe iw">Catch-up</strong>: We apply any lagging messages that accumulated in Kafka during the bootstrap phase.</li><li id="cd75" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl"><strong class="oe iw">Dual writes</strong>: At this stage, both v1 and v2 consume from the same Kafka topic. We ensure eventual consistency between the two, with replication lag typically within tens of milliseconds.</li></ol><p id="b72c" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">Once data migration is complete and we enter dual write mode, we can begin the read traffic migration phase. During this phase, our dispatcher can be dynamically configured to serve read requests for specific tables from v1, while sending shadow requests to v2 for consistency checks. We then gradually shift to serving reads from v2, accompanied by reverse shadow requests to v1 for consistency checks, which also enables quick fallback to v1 responses if v2 becomes unstable. 
Eventually, we fully transition to serving all read traffic from v2.</p><h2 id="a02c" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Lessons learned</h2><p id="0028" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">Several key insights emerged from this migration:</p><ul class=""><li id="8227" class="oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow qr qs qt bl"><strong class="oe iw">Consistency complexity:</strong> Migrating from an eventually consistent (v1) to a strongly consistent (v2) backend introduced new challenges, particularly around write conflicts. Addressing these required features like write deduplication, hotkey blocking, and lazy write repair — sometimes trading off storage cost or read performance.</li><li id="b682" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Presplitting is critical:</strong> As we shifted from hash-based (v1) to range-based partitioning (v2), inserting large consecutive data could cause hotspots and disrupt our v2 backend. To prevent this, we needed to accurately sample the v1 data and presplit it into multiple shards based on v2’s topology, ensuring balanced ingestion traffic across backend nodes during data migration.</li><li id="af1e" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Query model adjustments:</strong> v2 doesn’t push down range filters as effectively, requiring us to implement client-side pagination for prefix and range queries.</li><li id="d9ea" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Freshness vs. cost:</strong> Different use cases required different tradeoffs. 
Some prioritized data freshness and used primary replicas for the latest reads, while others leveraged secondary replicas to balance staleness with cost and performance.</li><li id="de68" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Kafka’s role:</strong> Kafka’s proven, stable millisecond-level p99 latency made it an invaluable part of our migration process.</li><li id="f7c9" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow qr qs qt bl"><strong class="oe iw">Building in flexibility:</strong> Customer retries and routine bulk jobs provided a safety net for the rare inconsistencies, and our migration design allowed for per-table stage assignments and instant reversibility — key for managing risk at scale.</li></ul><p id="e533" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">As a result, we migrated more than a petabyte of data across thousands of tables with zero downtime or data loss, thanks to a blue/green rollout, dual-write pipeline, and automated fallbacks — so the product teams could keep shipping features while the engine under them evolved.</p><h2 id="0148" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">Conclusion and next steps</h2><p id="9b23" class="pw-post-body-paragraph oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow hx bl">What sets Mussel v2 apart is the way it fuses capabilities that are usually confined to separate, specialized systems. 
In our deployment of Mussel v2, we observe that this system can simultaneously</p><ol class=""><li id="9f59" class="oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow rb qs qt bl">ingest tens of terabytes in bulk data upload,</li><li id="1ea7" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl">sustain 100k+ streaming writes per second in the same cluster, and</li><li id="bf80" class="oc od iv oe b of qu oh oi oj qv ol om gv qw oo op gy qx or os hb qy ou ov ow rb qs qt bl">keep p99 reads under 25 ms</li></ol><p id="fc58" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">— all while giving callers a simple dial to toggle stale reads on a per-namespace basis. By pairing a NewSQL backend with a Kubernetes-native control plane, Mussel v2 delivers the elasticity of object storage, the responsiveness of a low-latency cache, and the operability of modern service meshes — rolled into one platform. Engineers no longer need to stitch together a cache, a queue, and a datastore to hit their SLAs; Mussel provides those guarantees out of the box, letting teams focus on product innovation instead of data plumbing.</p><p id="78b1" class="pw-post-body-paragraph oc od iv oe b of og oh oi oj ok ol om gv on oo op gy oq or os hb ot ou ov ow hx bl">Looking ahead, we’ll be sharing deeper insights into how we’re evolving quality of service (QoS) management within Mussel, now orchestrated cleanly from the Dispatcher layer. We’ll also describe our journey in optimizing bulk loading at scale — unlocking new performance and reliability wins for complex data pipelines. 
If you’re passionate about building large-scale distributed systems and want to help shape the future of data infrastructure at Airbnb, take a look at our <a class="ah hv" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">Careers page</a> — we’re always looking for talented engineers to join us on this mission.</p><h2 id="2ff4" class="pq pr iv bg ps pt pu pv gs pw px py gu pz qa qb qc qd qe qf qg qh qi qj qk ql bl">References</h2><ol class=""><li id="13de" class="oc od iv oe b of qm oh oi oj qn ol om gv qo oo op gy qp or os hb qq ou ov ow rb qs qt bl"><a class="ah hv" rel="noopener" href="https://medium.com/airbnb-engineering/mussel-airbnbs-key-value-store-for-derived-data-406b9fa1b296" data-discover="true">https://medium.com/airbnb-engineering/mussel-airbnbs-key-value-store-for-derived-data-406b9fa1b296</a></li></ol></div>]]></description>
      <link>https://medium.com/airbnb-engineering/building-a-next-generation-key-value-store-at-airbnb-0de8465ba354</link>
      <guid>https://medium.com/airbnb-engineering/building-a-next-generation-key-value-store-at-airbnb-0de8465ba354</guid>
      <pubDate>Wed, 24 Sep 2025 18:02:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Viaduct, Five Years On: Modernizing the Data-Oriented Service Mesh]]></title>
      <description><![CDATA[<div class="hx hw iq ir is"><div class="it bi"><figure class="iu it bi paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 1100w, https://miro.medium.com/v2/resize:fit:3840/format:webp/1*q_owFEOLfQlioFXHP7UqBA.avif 3840w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1920px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*q_owFEOLfQlioFXHP7UqBA.avif 640w, https://miro.medium.com/v2/resize:fit:720/1*q_owFEOLfQlioFXHP7UqBA.avif 720w, https://miro.medium.com/v2/resize:fit:750/1*q_owFEOLfQlioFXHP7UqBA.avif 750w, https://miro.medium.com/v2/resize:fit:786/1*q_owFEOLfQlioFXHP7UqBA.avif 786w, https://miro.medium.com/v2/resize:fit:828/1*q_owFEOLfQlioFXHP7UqBA.avif 828w, https://miro.medium.com/v2/resize:fit:1100/1*q_owFEOLfQlioFXHP7UqBA.avif 1100w, https://miro.medium.com/v2/resize:fit:3840/1*q_owFEOLfQlioFXHP7UqBA.avif 3840w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1920px" /></picture></figure></div><div class="ac ci"><div class="cp bi id ie if ig"><div><div><h2 id="580c" class="pw-subtitle-paragraph jv ix iy bg b jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk cx eb">A more powerful engine and a simpler API for our data-oriented mesh</h2><div></div><p id="bc6e" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">In November 2020 we <a class="ah hi" rel="noopener" href="https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344" data-discover="true">published</a> a post about Viaduct, our data-oriented service mesh. Today, we’re excited to announce Viaduct is available as open-source software (OSS) at <a class="ah hi" href="https://github.com/airbnb/viaduct" rel="noopener ugc nofollow" target="_blank">https://github.com/airbnb/viaduct</a>.</p><p id="5de2" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">Before we talk about OSS, here’s a quick update on Viaduct’s adoption and evolution at Airbnb over the last five years. Since 2020, traffic through Viaduct has grown by a factor of eight. The number of teams hosting code in Viaduct has doubled to 130+ (with hundreds of weekly active developers). The codebase hosted by Viaduct has tripled to over <strong class="os iz">1.5M</strong> lines (plus about the same in test code). 
We’ve achieved all this while keeping operational overhead constant, halving incident-minutes, and keeping costs growing linearly with QPS.</p><h2 id="4dce" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">What’s the same?</h2><p id="a16b" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">Three principles have guided Viaduct since day one and still anchor the project: a <strong class="os iz">central schema</strong> served by <strong class="os iz">hosted business logic</strong> via a <strong class="os iz">re-entrant</strong> API.</p><p id="b83b" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl"><strong class="os iz">Central schema <br /></strong>Viaduct serves our central schema: a single, integrated schema connecting all of our domains across the company. While that schema is developed in a <em class="qi">decentralized</em> manner by many teams, it’s one, highly connected graph. Over 75% of Viaduct requests are internal because Viaduct has become a “one‑stop” data-oriented mesh connecting developers to all of our data and capabilities.</p><p id="ff4d" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl"><strong class="os iz">Hosted business logic <br /></strong>From the beginning, we’ve encouraged teams to host their business logic directly in Viaduct. This runs counter to what many consider to be best practices in GraphQL, which is that GraphQL servers should be a thin layer over microservices that host the real business logic. We’ve created a serverless platform for hosting business logic, allowing our developers to focus on writing business logic rather than on operational issues. 
As noted by Katie, an engineer on our Media team:</p><blockquote class="qj qk ql"><p id="8de6" class="oq or qi os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">“As we migrate our media APIs into Viaduct, we’re looking forward to retiring a handful of standalone services. Centralizing everything means less overhead, fewer moving parts, and a much smoother developer experience!”</p></blockquote><p id="0ddf" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl"><strong class="os iz">Re-entrancy</strong></p><p id="5fa5" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">At the heart of our developer experience is what we call <em class="qi">re-entrancy</em>: Logic hosted on Viaduct composes with other logic hosted on Viaduct by issuing GraphQL fragments and queries. Re-entrancy has been crucial for maintaining modularity in a large codebase and avoiding classic monolith hazards.</p><h2 id="75f8" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">What’s changed?</h2><p id="bedb" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">For most of Viaduct’s history, evolution has been bottom-up and reactive to immediate developer needs. We added capabilities incrementally, which helped us move fast, but also produced multiple ways to accomplish similar tasks (some well‑supported, others not) and created a confusing developer experience, especially for new teams. Another side-effect of this reactive approach has been a lack of architectural integrity. The interfaces between the layers of Viaduct, described in more detail below, are loose and often arbitrary, and the abstraction boundary between the Viaduct framework and the code that it hosts is weak. 
As a result, it has become increasingly difficult to make changes to Viaduct without disrupting our customer base.</p><p id="fb1e" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">To address these issues, over a year ago we launched a major initiative we call <strong class="os iz">“Viaduct Modern”</strong>, a ground-up overhaul of both the developer-facing API and the execution engine.</p><h2 id="6158" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Tenant API</h2><p id="2ddb" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">One driving principle of Viaduct Modern has been to simplify and rationalize the API we provide to developers in Viaduct, which we call the <strong class="os iz">“Tenant API”</strong>. The following diagram captures the decision tree one faced when deciding how to implement functionality in the old API:</p><figure class="qp qq qr qs qt it qm qn paragraph-image"><div role="button" tabindex="0" class="qu qv fr qw bi qx"><div class="qm qn qo"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*0Gujn6vAspuKh3ZuPjlK-g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*0Gujn6vAspuKh3ZuPjlK-g.png 640w, https://miro.medium.com/v2/resize:fit:720/1*0Gujn6vAspuKh3ZuPjlK-g.png 720w, https://miro.medium.com/v2/resize:fit:750/1*0Gujn6vAspuKh3ZuPjlK-g.png 750w, https://miro.medium.com/v2/resize:fit:786/1*0Gujn6vAspuKh3ZuPjlK-g.png 786w, https://miro.medium.com/v2/resize:fit:828/1*0Gujn6vAspuKh3ZuPjlK-g.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*0Gujn6vAspuKh3ZuPjlK-g.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*0Gujn6vAspuKh3ZuPjlK-g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rd fm re qm qn rf rg bg b bh ab eb">Viaduct’s original complex programming model</figcaption></figure><p id="f064" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">Each oval in this diagram represents a different mechanism for writing code. 
In contrast, the new API offers just two mechanisms: node resolvers and field resolvers.</p><figure class="qp qq qr qs qt it qm qn paragraph-image"><div role="button" tabindex="0" class="qu qv fr qw bi qx"><div class="qm qn rh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*xuL1H_ylQR5PU2eh2JuXEA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*xuL1H_ylQR5PU2eh2JuXEA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*xuL1H_ylQR5PU2eh2JuXEA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*xuL1H_ylQR5PU2eh2JuXEA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*xuL1H_ylQR5PU2eh2JuXEA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*xuL1H_ylQR5PU2eh2JuXEA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*xuL1H_ylQR5PU2eh2JuXEA.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*xuL1H_ylQR5PU2eh2JuXEA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rd fm re qm qn rf rg bg b bh ab eb">Viaduct Modern’s simpler model</figcaption></figure><p id="f1a8" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">The choice between the two is driven by the schema itself, not ad‑hoc distinctions based on a feature’s behavior. We unified the APIs for both resolver types wherever possible, which simplifies dev experience. After four years evolving the API in a use‑case‑driven manner, we distilled the best ideas into a single simple surface (and left the mistakes behind).</p><h2 id="e252" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Tenant modularity</h2><p id="130e" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">Strong abstraction boundaries are essential in any large codebase. Microservices achieve this via service definitions and RPC API boundaries; Viaduct achieves it via <strong class="os iz">modules</strong> plus <strong class="os iz">re‑entrancy</strong>.</p><p id="4660" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">Modularity in the central schema and hosted code has evolved. Initially, all we had was a vague set of conventions for organizing code into team-owned directories. 
There was no formal concept of a module, and schema and code were kept in separate source directories with unenforced naming conventions to connect the two. Over time, we evolved that into a more formal abstraction we call a “tenant module.” A tenant module is a unit of schema together with the code that implements that schema, and crucially, is owned by a single team. While we encourage rich graph‑level connections across modules, we <strong class="os iz">discourage direct code dependencies</strong> between modules. Instead, modules compose via GraphQL fragments and queries. Viaduct Modern extends and simplifies these re‑entrancy tools.</p><p id="99b0" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">Let’s look at an example. Imagine two teams, a “Core User” team that owns and manages the basic profile data of users, and then a “Messaging” team that operates a messaging platform for users to interact with each other. In our example, the Messaging team would like to define a <em class="qi">displayName</em> field on a <em class="qi">User</em>, which is used in their user interface. 
This would look something like this:</p><p id="93ba" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl"><strong class="os iz">Core User team</strong></p><pre class="qp qq qr qs qt ri rj rk bq rl bc bl">type User implements Node {<br />  id: ID!<br />  firstName: String<br />  lastName: String<br />  … <br />}</pre><pre class="hv ri rj rk bq rl bc bl">class UserResolver : Nodes.User() {<br />    @Inject<br />    val userClient: UserServiceClient<br /><br />    @Inject<br />    val userResponseMapper: UserResponseMapper<br /><br />    override suspend fun resolve(ctx: Context): User {<br />        val r = userClient.fetch(ctx.id)<br />        return userResponseMapper(r)<br />    }<br />}</pre><p id="9fce" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">This is the base definition of the <em class="qi">User</em> type that lives in the Core User team’s module. It defines the first- and last-name fields (among many others), and it’s the Core User team’s responsibility to materialize those fields.</p><p id="975b" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl"><strong class="os iz">Messaging team</strong></p><pre class="qp qq qr qs qt ri rj rk bq rl bc bl">extend type User {<br />  displayName: String @resolver<br />}</pre><pre class="hv ri rj rk bq rl bc bl">@Resolver("firstName lastName")<br />class DisplayNameResolver : UserResolvers.DisplayName() {<br />    override suspend fun resolve(ctx: Context): String {<br />        val f = ctx.objectValue.getFirstName()<br />        val l = ctx.objectValue.getLastName()<br />        return "$f ${l.first()}."<br />    }<br />}</pre><p id="208c" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">The Messaging team can then extend the <em class="qi">User</em> type with the display name 
field and indicate that they intend to provide a resolver for it. The code has an <em class="qi">@Resolver</em> annotation that indicates which fields of the <em class="qi">User</em> object it needs to implement the <em class="qi">displayName</em> field. The Messaging team doesn’t need to understand which module these fields come from, and their code doesn’t depend on code from the Core User team. Instead, the Messaging team states their data needs in a declarative fashion.</p><h2 id="00aa" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Framework modularity</h2><p id="7bac" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">A major goal of Viaduct Modern has been to make the framework itself more modular. We want to enable faster improvements to Viaduct, especially in performance and reliability, without extensive changes to application code. Viaduct is composed of three main layers: the GraphQL execution engine, the tenant API, and the hosted application code. While these layers made sense, the interfaces between them were weak, making Viaduct difficult to update. 
The new design is focused on creating strong abstraction boundaries between these layers to improve flexibility and maintainability:</p><figure class="qp qq qr qs qt it qm qn paragraph-image"><div role="button" tabindex="0" class="qu qv fr qw bi qx"><div class="qm qn rh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*qvOOmI-Hs7-PMa52kSACqQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*qvOOmI-Hs7-PMa52kSACqQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*qvOOmI-Hs7-PMa52kSACqQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*qvOOmI-Hs7-PMa52kSACqQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*qvOOmI-Hs7-PMa52kSACqQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*qvOOmI-Hs7-PMa52kSACqQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*qvOOmI-Hs7-PMa52kSACqQ.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*qvOOmI-Hs7-PMa52kSACqQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rd fm re qm qn rf rg bg b bh ab eb">Viaduct Modern’s modular design</figcaption></figure><p id="d6bf" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">The most significant change is the boundary between the <strong class="os iz">engine</strong> and the developer-facing <strong class="os iz">tenant API</strong>. In the previous system, that boundary hardly existed. Viaduct Modern defines a strong <strong class="os iz">engine API</strong> whose core is a <strong class="os iz">dynamically‑typed</strong> representation of GraphQL values (input and output objects as simple maps from field name → value). The tenant API, by contrast, is <strong class="os iz">statically typed</strong>: we generate Kotlin classes for every GraphQL type in the central schema. In the new architecture, generated types are thin wrappers over the dynamic representation. The tenant API forms the bridge between the engine’s untyped world and tenants’ typed world.</p><p id="6ec9" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">This separation lets us evolve the engine in relative isolation (to improve latency, throughput, and reliability) and evolve the tenant API in relative isolation (to improve dev experience). 
Large changes will still cross the boundary, but as Viaduct Modern stabilizes, that should be rare.</p><h2 id="f5a1" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Migration without a “big bang”</h2><p id="1f4c" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">Viaduct Modern would be a non‑starter if it required a step‑function migration of a million+ lines of code. To enable gradual migration, we’re shipping <strong class="os iz">two tenant APIs</strong> side‑by‑side — the new <strong class="os iz">Modern</strong> API and the existing <strong class="os iz">Classic</strong> API — both on top of the new engine. This lets teams realize performance/cost wins from the engine immediately, while adopting the ergonomic wins of the Modern API over time.</p><p id="6bcd" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">Shipping two APIs has also improved the engine API design: building two tenant runtimes simultaneously forced us to keep the engine’s concerns clean and general. Over time, we expect to build additional tenant APIs on the engine (e.g., <strong class="os iz">TypeScript</strong>).</p><h2 id="42fe" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Other improvements</h2><p id="28ed" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">Viaduct has seen a lot of other improvements since 2020. The story is too long to be told in this post, but to list some of the highlights:</p><ul class=""><li id="e7da" class="oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi rr rs rt bl"><strong class="os iz">Observability.</strong> Hosting software from 100+ teams means our framework has to make ownership and attribution crystal clear. 
In the old system, there is no clear dividing line between tenant code and framework code, so our instrumentation includes a bit of guesswork in its attribution to parties. The new architecture draws a crisp boundary, which enables deeper, more accurate attribution.</li><li id="b057" class="oq or iy os b jw ru ou ov jz rv ox oy gv rw pa pb gy rx pd pe hb ry pg ph pi rr rs rt bl"><strong class="os iz">Build time.</strong> We’re schema‑first: Developers write schema as source, and Viaduct generates code to provide a strongly typed, ergonomic surface. At our scale, build time is a constant battle. Over the years we’ve made numerous investments to improve build time, including direct-to-bytecode code generation that bypasses lengthy compilation of generated code. We anticipate that the improved modularity in Viaduct Modern will keep build times in check as the codebase grows.</li><li id="3eaf" class="oq or iy os b jw ru ou ov jz rv ox oy gv rw pa pb gy rx pd pe hb ry pg ph pi rr rs rt bl"><strong class="os iz">Dispatcher.</strong> We run Viaduct as a horizontally-scaled Kubernetes app. To mitigate blast radius, we use a dispatcher that routes operations to deployment shards, applying <a class="ah hi" href="https://aws.amazon.com/blogs/architecture/shuffle-sharding-massive-and-magical-fault-isolation/" rel="noopener ugc nofollow" target="_blank">shuffle sharding</a>. It also simplifies isolating offline vs. online traffic and hosting experimental framework builds. (We don’t currently plan to open‑source the dispatcher as it’s tightly tied to Airbnb’s serving framework, but we may talk more about our strategies here in the future!)</li></ul><h2 id="1a46" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Open-sourcing Viaduct</h2><p id="82f2" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">Our intent from the start of Viaduct Modern was to open‑source it. 
We believe that setting out to build software for the world will result in higher-quality software for ourselves. Also, as a significant consumer of open-source software, we feel an obligation to give back. Last but not least, while we’ve learned a lot by working with Airbnb developers, we think Viaduct can be massively improved by incorporating ideas and contributions from the wider community.</p><p id="d110" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">We’re open‑sourcing at an interesting moment. Viaduct is <strong class="os iz">mature and battle‑tested</strong> <em class="qi">and also</em> <strong class="os iz">new and evolving</strong>. The new engine is now in full production, while the new tenant API is still in alpha. We’ve implemented a small but robust kernel of the new API and are using it in a few (demanding) use cases. We’re investing heavily in the Modern API and migrating our own workloads to it, but it’s early days. By open‑sourcing early, we hope to grow a community that will shape the API together with us.</p><h2 id="cac3" class="pj pk iy bg pl pm pn jy gs po pp kb gu pq pr ps pt pu pv pw px py pz qa qb qc bl">Is Viaduct for you?</h2><p id="a659" class="pw-post-body-paragraph oq or iy os b jw qd ou ov jz qe ox oy gv qf pa pb gy qg pd pe hb qh pg ph pi hx bl">Although our use case shows Viaduct can scale to massive graphs, we think it’s also a great GraphQL server when you’re just starting. We’ve emphasized developer ergonomics from day one, and we believe Viaduct provides one of the best environments for building GraphQL solutions. 
Whether you’re operating a super-graph today or just kicking the tires, we’d love for you to try Viaduct Modern and tell us what works — and what doesn’t.</p><p id="13e6" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">—</p><p id="10ca" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl">Thanks to the entire Viaduct team, and especially Aileen Chen and Raymie Stata, for the tireless work on Viaduct Modern.</p></div></div></div><div class="ac ci rz sa sb sc" role="separator"><div class="hx hw iq ir is"><div class="ac ci"><div class="cp bi id ie if ig"><p id="f125" class="pw-post-body-paragraph oq or iy os b jw ot ou ov jz ow ox oy gv oz pa pb gy pc pd pe hb pf pg ph pi hx bl"><em class="qi">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/viaduct-five-years-on-modernizing-the-data-oriented-service-mesh-e66397c9e9a9</link>
      <guid>https://medium.com/airbnb-engineering/viaduct-five-years-on-modernizing-the-data-oriented-service-mesh-e66397c9e9a9</guid>
      <pubDate>Wed, 17 Sep 2025 19:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Taming Service-Oriented Architecture Using A Data-Oriented Service Mesh]]></title>
      <description><![CDATA[<div class="ac ci"><div class="cp bi id ie if ig"><div><div></div><p id="9ab6" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">Introducing Viaduct, Airbnb’s data-oriented service mesh</p><p id="73ec" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">By: Raymie Stata, Arun Vijayvergiya, Adam Miskiewicz</p></div></div><div class="oy bi"><figure class="oz pa pb pc pd oy bi paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 1100w, https://miro.medium.com/v2/resize:fit:4800/format:webp/1*KUevf-1aGcLQyit3wzmh8w.jpeg 4800w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 100vw" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*KUevf-1aGcLQyit3wzmh8w.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*KUevf-1aGcLQyit3wzmh8w.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*KUevf-1aGcLQyit3wzmh8w.jpeg 750w, 
https://miro.medium.com/v2/resize:fit:786/1*KUevf-1aGcLQyit3wzmh8w.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*KUevf-1aGcLQyit3wzmh8w.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*KUevf-1aGcLQyit3wzmh8w.jpeg 1100w, https://miro.medium.com/v2/resize:fit:4800/1*KUevf-1aGcLQyit3wzmh8w.jpeg 4800w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 100vw" /></picture></figure></div><div class="ac ci"><div class="cp bi id ie if ig"><p id="2856" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">At Hasura’s <a class="ah hi" href="https://hasura.io/enterprisegraphql/" rel="noopener ugc nofollow" target="_blank">Enterprise GraphQL Conf</a> on October 22, we presented Viaduct, what we’re calling a <em class="pf">data-oriented service mesh </em>that we believe will bring a step function improvement in the modularity of our microservices-based Service-Oriented Architecture (SOA). In this blog post, we describe the philosophy behind Viaduct and provide a rough sketch of how it works. 
Please <a class="ah hi" href="https://www.youtube.com/watch?v=xxk9MWCk7cM" rel="noopener ugc nofollow" target="_blank">watch the presentation</a> for a more detailed look.</p><h2 id="6983" class="pg ph iv bg pi pj pk pl gs pm pn po gu pp pq pr ps pt pu pv pw px py pz qa qb bl">Massive SOA Dependency Graphs</h2><p id="5ad8" class="pw-post-body-paragraph od oe iv of b og qc oi oj ok qd om on gv qe op oq gy qf os ot hb qg ov ow ox hx bl">For a while, <strong class="of iw">S</strong>ervice-<strong class="of iw">O</strong>riented <strong class="of iw">A</strong>rchitectures have been moving towards ever larger numbers of small microservices. Modern applications can consist of thousands to tens of thousands of microservices connected in unconstrained ways. As a result, it’s not uncommon to see dependency graphs like the following:</p><figure class="oz pa pb pc pd oy qh qi paragraph-image"><div role="button" tabindex="0" class="qk ql fr qm bi qn"><div class="qh qi qj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*y2E1EmsS3paNhoChYGnJIw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*y2E1EmsS3paNhoChYGnJIw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*y2E1EmsS3paNhoChYGnJIw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*y2E1EmsS3paNhoChYGnJIw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*y2E1EmsS3paNhoChYGnJIw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*y2E1EmsS3paNhoChYGnJIw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*y2E1EmsS3paNhoChYGnJIw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*y2E1EmsS3paNhoChYGnJIw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="ee2f" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">This particular dependency graph happens to be from Airbnb, but it’s not uncommon. 
<a class="ah hi" href="https://twitter.com/werner/status/741673514567143424" rel="noopener ugc nofollow" target="_blank">Amazon</a>, <a class="ah hi" rel="noopener" href="https://medium.com/refraction-tech-everything/how-netflix-works-the-hugely-simplified-complex-stuff-that-happens-every-time-you-hit-play-3a40c9be254b" data-discover="true">Netflix</a>, and <a class="ah hi" href="https://eng.uber.com/microservice-architecture/" rel="noopener ugc nofollow" target="_blank">Uber</a> are among the companies that have shared similarly tangled dependency graphs.</p><p id="c6df" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">These dependency graphs are reminiscent of <a class="ah hi" href="https://en.wikipedia.org/wiki/Spaghetti_code" rel="noopener ugc nofollow" target="_blank">spaghetti code</a>, just at the microservices level. Just as spaghetti code becomes harder and harder to modify over time, so does spaghetti SOA. To help manage the larger number of services inherent in a microservices-based architecture, we need organizing principles as well as technical measures to implement those principles. At Airbnb, we undertook an effort to find such principles and measures. Our investigations led us to the concept of a <em class="pf">data-oriented service mesh</em>, which we believe brings a new level of modularity to SOA.</p><h2 id="f767" class="pg ph iv bg pi pj pk pl gs pm pn po gu pp pq pr ps pt pu pv pw px py pz qa qb bl">Procedure- vs Data-Oriented Design</h2><p id="3847" class="pw-post-body-paragraph od oe iv of b og qc oi oj ok qd om on gv qe op oq gy qf os ot hb qg ov ow ox hx bl">Organizing large programs into modular units is not a new problem in software engineering. Up until the 1970s, the main paradigm of software organization focused on grouping code into procedures and procedures into modules. 
In this approach, modules publish a public API to be used by code outside of the module; behind this public API, modules hide their internal helper procedures and other implementation details. Languages such as Pascal and C are based on this paradigm.</p><p id="3550" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">Starting in the 1980s, the paradigm shifted to organizing software primarily around data, not procedures. In this approach, modules define classes of objects that encapsulate an internal representation of an object accessed via a public API of methods on the object. Languages such as Simula and CLU pioneered this form of organization.</p><p id="d703" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">SOA is a step back to more procedure-oriented designs. Today’s microservice is a collection of procedural endpoints: a classic, 1970s-style module. We believe that SOA needs to evolve to support data-oriented design, and that this evolution can be enabled by transitioning our service mesh from a procedural orientation to a data orientation.</p><h2 id="25b3" class="pg ph iv bg pi pj pk pl gs pm pn po gu pp pq pr ps pt pu pv pw px py pz qa qb bl">Viaduct: A Data-Oriented Service Mesh</h2><p id="deb8" class="pw-post-body-paragraph od oe iv of b og qc oi oj ok qd om on gv qe op oq gy qf os ot hb qg ov ow ox hx bl">Central to modern, scalable SOA applications is a <em class="pf">service mesh</em> (e.g., <a class="ah hi" href="https://istio.io/" rel="noopener ugc nofollow" target="_blank">Istio</a>, <a class="ah hi" href="https://linkerd.io/" rel="noopener ugc nofollow" target="_blank">Linkerd</a>), which routes service invocations to instances of microservices that can handle them. 
The current industry standard for service meshes is to organize exclusively around remote procedure invocations without knowing anything about the data that makes up the application architecture. Our vision is to replace these procedure-oriented service meshes with service meshes organized around <em class="pf">data.</em></p><p id="d6a1" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">At Airbnb, we are using <a class="ah hi" href="https://graphql.org/" rel="noopener ugc nofollow" target="_blank">GraphQL</a>™️ to build a data-oriented service mesh called <em class="pf">Viaduct.</em> A Viaduct service mesh is defined in terms of a GraphQL schema consisting of:</p><ul class=""><li id="988a" class="od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox qt qu qv bl"><em class="pf">Types</em> (and <em class="pf">interfaces</em>) describing data managed within your service mesh</li><li id="5f91" class="od oe iv of b og qw oi oj ok qx om on gv qy op oq gy qz os ot hb ra ov ow ox qt qu qv bl"><em class="pf">Queries</em> (and <em class="pf">subscriptions</em>) providing means to access that data, which is abstracted from the service entry points that provide the data</li><li id="d683" class="od oe iv of b og qw oi oj ok qx om on gv qy op oq gy qz os ot hb ra ov ow ox qt qu qv bl"><em class="pf">Mutations </em>providing ways to update data, again abstracted from service entry points</li></ul><p id="92df" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">The types (and interfaces) in the schema define a single graph across all of the data managed within the service mesh. For example, at an eCommerce company, a service mesh’s schema may define a field <code class="de rb rc rd re b">productById(id: ID)</code> that returns results of type <code class="de rb rc rd re b">Product</code>. 
From this starting point, a single query allows a data consumer to navigate to information about the product’s manufacturer, e.g., <code class="de rb rc rd re b">productById { manufacturer }</code>; reviews of the product, e.g., <code class="de rb rc rd re b">productById { reviews }</code>; and even the authors of those reviews, e.g., <code class="de rb rc rd re b">productById { reviews { author } }</code>.</p><p id="c2cc" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">The data elements requested by such a query may come from many different microservices. In a procedure-oriented service mesh, the data consumer would need to take these services as explicit dependencies. In our data-oriented service mesh, it is the service mesh, i.e., Viaduct, not the data consumer, that knows which services provide which data elements. Viaduct abstracts away the service dependencies from any single consumer.</p><h2 id="54be" class="pg ph iv bg pi pj pk pl gs pm pn po gu pp pq pr ps pt pu pv pw px py pz qa qb bl">Putting Schema at the Center</h2><p id="75f2" class="pw-post-body-paragraph od oe iv of b og qc oi oj ok qd om on gv qe op oq gy qf os ot hb qg ov ow ox hx bl">In our talk we discuss how, unlike other distributed GraphQL systems such as <a class="ah hi" href="https://graphql-modules.com/" rel="noopener ugc nofollow" target="_blank">GraphQL Modules</a> or <a class="ah hi" href="https://www.apollographql.com/docs/federation/" rel="noopener ugc nofollow" target="_blank">Apollo Federation</a>, Viaduct treats the schema as a single artifact and provides several primitives that let us keep a unified schema while still allowing many teams to collaborate on it productively. As Viaduct replaces more and more of our underlying procedure-oriented service mesh, its schema captures the data managed by our application more and more completely. 
We have taken advantage of this “central schema,” as we call it, as a place to define the APIs of some of our microservices. In particular, we have started using GraphQL for the API of some microservices. For these microservices, their GraphQL schemas are defined as a subset of the central schema. In the future, we want to take this idea further, using the central schema to define the schema of data stored in our database.</p><p id="9362" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">Among other things, using the central schema to define our APIs and database schemas will solve one of the bigger challenges of large-scale SOA applications: data agility. In today’s SOA applications, a change to a database schema often needs to be manually reflected in the APIs of two, three, and sometimes even more layers of microservices before it can be exposed to client code. Such changes can require weeks of coordinating among multiple teams. By deriving service APIs and database schemas from a single, central schema, a database schema change like this can be propagated to client code with a single update.</p><h2 id="9d70" class="pg ph iv bg pi pj pk pl gs pm pn po gu pp pq pr ps pt pu pv pw px py pz qa qb bl">Going Serverless</h2><p id="9965" class="pw-post-body-paragraph od oe iv of b og qc oi oj ok qd om on gv qe op oq gy qf os ot hb qg ov ow ox hx bl">Often in large SOA applications, there are many stateless “derived-data” services and “backend-for-frontend” services that take raw data from lower-level services and transform it into data that’s more appropriate for presentation in clients. 
Stateless logic like this is a good fit for the serverless computing model, which eliminates the operational overhead of microservices altogether and instead hosts logic in a “cloud functions” fabric.</p><p id="de19" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">Viaduct has a mechanism for computing what we call “derived fields” using serverless cloud functions that operate on top of the graph without knowledge of the underlying services. These functions allow us to move transformational logic out of the service mesh and into stateless containers, keeping our graph clean and reducing the number and complexity of services we need.</p><h2 id="87c1" class="pg ph iv bg pi pj pk pl gs pm pn po gu pp pq pr ps pt pu pv pw px py pz qa qb bl">Conclusion</h2><p id="f052" class="pw-post-body-paragraph od oe iv of b og qc oi oj ok qd om on gv qe op oq gy qf os ot hb qg ov ow ox hx bl">Viaduct is built on <a class="ah hi" href="https://www.graphql-java.com/" rel="noopener ugc nofollow" target="_blank">graphql-java</a> and supports fine-grained field selection via GraphQL selection sets. It uses modern data-loading techniques, employs reliability techniques such as short-circuiting and soft dependencies, and implements an intra-request cache. Viaduct provides <em class="pf">data observability, </em>allowing us to understand, down to the field level, what services consume what data. As a GraphQL interface, Viaduct allows us to take advantage of a large ecosystem of open source tooling, including live IDEs, mock servers, and schema visualizers.</p><p id="ff8a" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">Viaduct started powering production workflows at Airbnb over a year ago. 
We started from scratch with a clean schema consisting of a handful of entities and have grown it to include 80 core entities that are able to power 75% of our modern API traffic.</p><p id="c352" class="pw-post-body-paragraph od oe iv of b og oh oi oj ok ol om on gv oo op oq gy or os ot hb ou ov ow ox hx bl">As mentioned in the introduction, more details on the motivation and technology behind Viaduct can be found in our <a class="ah hi" href="https://www.youtube.com/watch?v=xxk9MWCk7cM" rel="noopener ugc nofollow" target="_blank">presentation</a>.</p></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344</link>
      <guid>https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344</guid>
      <pubDate>Tue, 16 Sep 2025 20:37:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Migrating Airbnb’s JVM Monorepo to Bazel]]></title>
      <description><![CDATA[<div><div></div><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw nx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*rSkIrKYhc8Bwcv2xLE1mxA.jpeg 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="2154" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">At Airbnb, we recently completed migrating our largest repo, the JVM monorepo, to Bazel. This repo contains <strong class="oq ip">tens of millions of lines</strong> of Java, Kotlin, and Scala code that power the vast array of backend services and data pipelines behind airbnb.com.</p><p id="9e00" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk"><strong class="oq ip">Migration in numbers (4.5 years of work):</strong></p><ul class=""><li id="e79c" class="oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi pj pk pl bk">Build CSAT: 38% → 68%</li><li id="fedc" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk"><strong class="oq ip">3–5x </strong>faster local build and test times</li><li id="ce7b" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk"><strong class="oq ip">2–3x </strong>faster IntelliJ syncs</li><li id="4558" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk"><strong class="oq ip">2–3x</strong> faster deploys to the development environment</li></ul><p id="d4f7" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In this blog post, 
we’ll discuss the <strong class="oq ip">why</strong>, share some highlights on the <strong class="oq ip">how</strong>, and finish off with <strong class="oq ip">key learnings</strong>.</p><h1 id="60b9" class="pr ps io bf pt pu pv pw gl px py pz gn qa qb qc qd qe qf qg qh qi qj qk ql qm bk">Why Bazel?</h1><p id="c5d5" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">Before the migration, our JVM monorepo used Gradle as its build system. We decided to migrate to Bazel because it offered three key advantages: speed, reliability, and a uniform build infrastructure layer.</p><h2 id="dc52" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Speed</h2><p id="344c" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk"><em class="qz">Bazel’s cacheable, portable actions allow us to scale performance with remote execution</em></p><p id="7413" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In 2021, builds of large services often took &gt;20 minutes locally and pre-merge CI p90 was 35 minutes.</p><p id="c3a3" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Building with Gradle was near its limit. We had already vertically scaled to high-end AWS machines on CI and remote development machines for developers of large services. In CI, we also used heuristics to split project builds and tests across multiple machines. 
However, this was inefficient, because of machine underutilization and duplication of shared tasks.</p><figure class="rb rc rd re rf od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw ra"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*y__bh8jYeKCNGmCW 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*y__bh8jYeKCNGmCW 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*y__bh8jYeKCNGmCW 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*y__bh8jYeKCNGmCW 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*y__bh8jYeKCNGmCW 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*y__bh8jYeKCNGmCW 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*y__bh8jYeKCNGmCW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*y__bh8jYeKCNGmCW 640w, https://miro.medium.com/v2/resize:fit:720/0*y__bh8jYeKCNGmCW 720w, https://miro.medium.com/v2/resize:fit:750/0*y__bh8jYeKCNGmCW 750w, https://miro.medium.com/v2/resize:fit:786/0*y__bh8jYeKCNGmCW 786w, https://miro.medium.com/v2/resize:fit:828/0*y__bh8jYeKCNGmCW 828w, https://miro.medium.com/v2/resize:fit:1100/0*y__bh8jYeKCNGmCW 1100w, https://miro.medium.com/v2/resize:fit:1400/0*y__bh8jYeKCNGmCW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 
700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="3926" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Bazel remote execution allowed us to scale to thousands of parallel actions. This was far more efficient than our sharding heuristics. Remote build execution (RBE) workers are also short-lived, which results in better machine utilization and cost efficiency. In addition, <a class="ag hb" href="https://blog.bazel.build/2021/04/07/build-without-the-bytes.html" rel="noopener ugc nofollow" target="_blank">Build without the Bytes</a> allows downloading only a subset of files, greatly reducing download volume (in Gradle, every cached artifact needs to be downloaded). 
Finally and most importantly, local builds are significantly faster thanks to RBE.</p><figure class="rb rc rd re rf od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*qEJJw_dqvH--5nS1 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*qEJJw_dqvH--5nS1 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*qEJJw_dqvH--5nS1 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*qEJJw_dqvH--5nS1 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*qEJJw_dqvH--5nS1 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*qEJJw_dqvH--5nS1 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*qEJJw_dqvH--5nS1 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*qEJJw_dqvH--5nS1 640w, https://miro.medium.com/v2/resize:fit:720/0*qEJJw_dqvH--5nS1 720w, https://miro.medium.com/v2/resize:fit:750/0*qEJJw_dqvH--5nS1 750w, https://miro.medium.com/v2/resize:fit:786/0*qEJJw_dqvH--5nS1 786w, https://miro.medium.com/v2/resize:fit:828/0*qEJJw_dqvH--5nS1 828w, https://miro.medium.com/v2/resize:fit:1100/0*qEJJw_dqvH--5nS1 1100w, https://miro.medium.com/v2/resize:fit:1400/0*qEJJw_dqvH--5nS1 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="87b8" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In addition, <a class="ag hb" href="https://docs.gradle.org/current/userguide/build_lifecycle.html#sec:configuration" rel="noopener ugc nofollow" target="_blank">Gradle configuration</a> of some large projects often took minutes due to it being single-threaded. Bazel analysis, in contrast, runs in parallel, in part because its configuration language, Starlark, is constrained to be side-effect-free.</p><h2 id="f84d" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Reliability</h2><p id="fae5" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk"><em class="qz">Bazel’s hermeticity ensures reliable, repeatable builds</em></p><p id="a4dd" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Gradle tasks have access to the full file system, which can lead to serious unintended consequences at scale. One example we ran into was when a developer updated a task to clean up recent files in the /tmp/ directory. 
This created a race condition with other Gradle tasks that used the /tmp/ directory and caused CI to fail when thousands of Gradle tasks had to be rerun.</p><p id="49d1" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Bazel solves this issue with sandboxing, which ensures that only specified inputs are available to a build action. If a file isn’t declared as an input, it simply doesn’t exist in the sandboxed environment.</p><p id="c418" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Gradle tasks also implicitly depend on the machine’s resources. Gradle builds run on local machines and CI machines of different sizes. This can lead to resource contention when a task is run on a smaller machine or when the cache is cold and thousands of tasks are run on the same machine.</p><p id="788e" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Remote build execution (RBE) solves this by running actions in identical containers with strict resource limits. We also configured both local and CI builds to use RBE, which greatly reduces environment differences.</p><h2 id="0901" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Shared infrastructure</h2><p id="edb9" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">Airbnb currently has a collection of language- and platform-specific repos, such as <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/adopting-bazel-for-web-at-scale-a784b2dbe325" data-discover="true">web</a>, <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/migrating-our-ios-build-system-from-buck-to-bazel-ddd6f3f25aa3" data-discover="true">iOS</a>, Python, and Go, all of which are now on Bazel. 
Unifying on Bazel enables a uniform build infrastructure layer across repos, which includes:</p><ul class=""><li id="6781" class="oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi pj pk pl bk">Remote caching</li><li id="9e67" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk">Remote build execution</li><li id="3680" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk">Affected targets calculation</li><li id="fbdf" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk">Instrumentation &amp; logging from the <a class="ag hb" href="https://bazel.build/docs/build-event-protocol" rel="noopener ugc nofollow" target="_blank">Build Event Protocol</a></li></ul><h1 id="e015" class="pr ps io bf pt pu pv pw gl px py pz gn qa qb qc qd qe qf qg qh qi qj qk ql qm bk">How did we migrate?</h1><h2 id="181c" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Proof of concept</h2><p id="f1c9" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">As a first milestone, we wanted to show that we could build a service and run its unit tests in Bazel. Because this was a proof of concept, we wanted to <strong class="oq ip">minimize disruption to engineers</strong>. Therefore, the Bazel build <em class="qz">co-existed </em>with Gradle<strong class="oq ip">. </strong>As a result, developers could choose between using Gradle or Bazel locally<em class="qz">.</em></p><p id="99d8" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">We needed to prove that developers would <em class="qz">choose</em> to opt in to using Bazel over Gradle. 
It wasn’t enough for Bazel to be faster; developers had to willingly opt in to using Bazel.</p><p id="93c2" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">For this proof of concept, we chose Airbnb’s GraphQL monolith platform, <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344" data-discover="true">Viaduct</a>, which had the following important properties:</p><ol class=""><li id="6766" class="oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi rh pk pl bk">It was one of Airbnb’s largest and most complex services. If we could migrate Viaduct to Bazel, then we could likely migrate the rest of the monorepo.</li><li id="7c20" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi rh pk pl bk">Slow builds were a major pain point, so Bazel could have a large impact.</li><li id="43bd" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi rh pk pl bk">Viaduct has 300 product engineers modifying its code every month, so improving Viaduct’s build speed would be a substantial productivity win.</li><li id="7f0b" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi rh pk pl bk">Because of (2) and (3) above, Viaduct’s core infrastructure team was eager to partner with us.</li></ol><p id="d207" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">To achieve a working Viaduct build with Bazel, we did two things. First, we had to port much of Viaduct’s build logic from Gradle to Bazel. 
Second, because we decided to maintain co-existing builds and the Gradle build graph was still changing, we decided to build an automated build file generator (which we’ll cover in detail in a separate section).</p><figure class="rb rc rd re rf od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*sleFOnx1XS43xzAI 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*sleFOnx1XS43xzAI 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*sleFOnx1XS43xzAI 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*sleFOnx1XS43xzAI 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*sleFOnx1XS43xzAI 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*sleFOnx1XS43xzAI 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*sleFOnx1XS43xzAI 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*sleFOnx1XS43xzAI 640w, https://miro.medium.com/v2/resize:fit:720/0*sleFOnx1XS43xzAI 720w, https://miro.medium.com/v2/resize:fit:750/0*sleFOnx1XS43xzAI 750w, https://miro.medium.com/v2/resize:fit:786/0*sleFOnx1XS43xzAI 786w, https://miro.medium.com/v2/resize:fit:828/0*sleFOnx1XS43xzAI 828w, https://miro.medium.com/v2/resize:fit:1100/0*sleFOnx1XS43xzAI 1100w, https://miro.medium.com/v2/resize:fit:1400/0*sleFOnx1XS43xzAI 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="88f9" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Importantly, even though we were able to locally build the service 2–4x faster with Bazel, many developers did not yet want to switch.</p><p id="dafa" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In talking with the service’s owners, we discovered a number of missing integrations and bugs. It took us an additional few months to address these pain points, after which Viaduct developers willingly switched from Gradle to Bazel.</p><h2 id="4c99" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Scaling builds and tests with Bazel</h2><p id="2bb6" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">The proof of concept showed that Bazel was superior to Gradle for one of Airbnb’s largest services and a large audience of developers. Now we wanted to scale it to the rest of Airbnb’s JVM monorepo.</p><p id="9c66" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">We decided to scale breadth-first, getting all of the repo compiling and testing in Bazel. 
Again, to minimize disruption, Bazel builds co-existed with Gradle, which had two important benefits.</p><p id="8ed7" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">First, developers could still use Bazel for local development and get most of its benefits even though their code was still built with Gradle for deployment. Second, we could always disable Bazel if it was negatively impacting developers. For example, when Bazel infrastructure like the remote cache or remote execution cluster experienced an incident, we could and did disable Bazel, letting users fall back to Gradle.</p><p id="0719" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">However, a major downside was that both Gradle and Bazel build graphs had to be maintained. Manually maintaining a Bazel build graph would have degraded the developer experience. As a result, we invested further in automated build file generation, so that developers didn’t need to manually maintain Bazel build files.</p><h2 id="1bc3" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Automated build file generation</h2><p id="b511" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">For our build file generator, we were heavily inspired by <a class="ag hb" href="https://github.com/bazel-contrib/bazel-gazelle" rel="noopener ugc nofollow" target="_blank">Gazelle</a>, which generates Bazel build files by parsing source files to build a dependency graph.</p><p id="6a99" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Although we considered extending Gazelle to support JVM languages, we had very strict performance requirements and needed to handle dependency cycles. 
This ultimately led us to build our own automated build file generator.</p><p id="260c" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Because we had to maintain <em class="qz">co-existing build graphs</em>, we needed to run the build file generator on every commit before merging into mainline. This meant it had to run as fast as possible to not significantly degrade the developer experience. To achieve this, we implemented external caching to speed up the automated build file generation.</p><p id="cdf7" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Similar to Gazelle, the build file generator parses Java, Kotlin, and Scala source files for package and import statements and symbol declarations to build a file-level dependency graph.</p><p id="8ac1" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In CI, we publish a cached index of the repository at each mainline commit. When a user runs sync-configs, it downloads this cache and only re-scans directories which have changed since the merge base. This greatly improves performance for the common case where users only modify a small set of files.</p><p id="09a0" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In addition, with this build file generator, we were able to support a more fine-grained build graph, which resulted in &gt;10x more Bazel targets than Gradle projects. This enabled faster builds through more parallel builds and less cache invalidation. 
However, one challenge of moving to a more fine-grained build graph is the possibility of introducing compilation cycles; sync-configs is able to detect this automatically and merge compilation units when necessary.</p><figure class="rb rc rd re rf od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*zmnMMjLRC2a-XCO0 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*zmnMMjLRC2a-XCO0 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*zmnMMjLRC2a-XCO0 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*zmnMMjLRC2a-XCO0 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*zmnMMjLRC2a-XCO0 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*zmnMMjLRC2a-XCO0 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*zmnMMjLRC2a-XCO0 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*zmnMMjLRC2a-XCO0 640w, https://miro.medium.com/v2/resize:fit:720/0*zmnMMjLRC2a-XCO0 720w, https://miro.medium.com/v2/resize:fit:750/0*zmnMMjLRC2a-XCO0 750w, https://miro.medium.com/v2/resize:fit:786/0*zmnMMjLRC2a-XCO0 786w, https://miro.medium.com/v2/resize:fit:828/0*zmnMMjLRC2a-XCO0 828w, https://miro.medium.com/v2/resize:fit:1100/0*zmnMMjLRC2a-XCO0 1100w, https://miro.medium.com/v2/resize:fit:1400/0*zmnMMjLRC2a-XCO0 
1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="276c" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Even after the migration, the build file generator remains in use. It improves the developer experience by automatically fixing build file configurations and removing unused dependencies. In contrast, when we were on Gradle, users manually maintained ~4,500 Gradle files, which led to unused dependencies, a bloated dependency graph, and slower builds with fewer cache hits.</p><h2 id="776b" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Porting build logic</h2><p id="b64b" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">In addition to JVM source files, our repo has a large amount of build logic such as code generation owned by multiple teams. These often took the form of Gradle plugins.</p><p id="d37b" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Because we now had automated build file generation, when porting the build logic, we also had to integrate it with the build file generator. 
Also, because we had more granular targets, a single line of Gradle config now might need to apply to 10+ Bazel build files.</p><p id="b45a" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">As a result, our build file generator architecture supported plugins similar to Gazelle extensions. These plugins were triggered by the presence of specific files such as Thrift or GraphQL files. These plugins could also generate new build targets such as codegen actions.</p><p id="4eeb" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">In some cases, the Gradle logic was manually added as a one-off or not easily inferred from the file structure. As a result, we also supported generator <em class="qz">directives</em> similar to Gazelle, such as adding dependencies or setting attributes.</p><p id="f540" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Initially, our team ported much of this build logic ourselves with minimal help from service owners. As Bazel adoption grew, owners of complex build logic were incentivized to migrate to Bazel, because it was faster and more reliable than Gradle. 
In the process, they often wrote their own build file generator plugins, highlighting the extensibility of our generator.</p><h2 id="5377" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Third-party library multi-version support</h2><p id="fd01" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">Another major issue we hit on the road to 100% compilation and testing with Bazel was multiple versions of the same third-party library.</p><p id="9bea" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Initially, we specified a single version of each library. The build file generator would add dependencies from this universe.</p><p id="3ddf" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">However, in Gradle, each sub-project within the monorepo could use different versions of third-party libraries. 
As a result, when compiling against a single version, compilation or testing could fail with a missing symbol.</p><p id="790c" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">To bring multi-version support to our Bazel system, we built tooling to generate multiple maven_install rules and added a custom <a class="ag hb" href="https://bazel.build/extending/aspects" rel="noopener ugc nofollow" target="_blank">aspect</a> to resolve conflicts at the target level.</p><figure class="rb rc rd re rf od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*_U1LAeex52squvmt 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*_U1LAeex52squvmt 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*_U1LAeex52squvmt 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*_U1LAeex52squvmt 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*_U1LAeex52squvmt 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*_U1LAeex52squvmt 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*_U1LAeex52squvmt 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*_U1LAeex52squvmt 640w, https://miro.medium.com/v2/resize:fit:720/0*_U1LAeex52squvmt 720w, 
https://miro.medium.com/v2/resize:fit:750/0*_U1LAeex52squvmt 750w, https://miro.medium.com/v2/resize:fit:786/0*_U1LAeex52squvmt 786w, https://miro.medium.com/v2/resize:fit:828/0*_U1LAeex52squvmt 828w, https://miro.medium.com/v2/resize:fit:1100/0*_U1LAeex52squvmt 1100w, https://miro.medium.com/v2/resize:fit:1400/0*_U1LAeex52squvmt 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="46e8" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk"><em class="qz">Multiple versions of Guava in Bazel before we added conflict resolution</em></p><p id="e12a" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Once we had this capability, we systematically synced library versions from Gradle so that each build target’s classpath more closely matched its Gradle counterpart.</p><p id="060b" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">To learn more about our approach, see our <a class="ag hb" href="https://www.youtube.com/watch?v=Ui4YtqWhqYU" rel="noopener ugc nofollow" target="_blank">BazelCon 2022 talk</a>. 
Since giving this talk, we have made improvements like moving the resolution to analysis-time for better IDE support and adding more user-friendly tooling for updating libraries.</p><h2 id="dd26" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Migrating the deploys</h2><p id="785a" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">As we got the vast majority of projects building and passing unit tests with Bazel in CI, we began to focus on migrating deployments. The Bazel-built `.jar`s were not identical to the Gradle-built `.jar`s. As a result, we needed a strategy to ensure the deployments were safe. We started with services.</p><h2 id="cea7" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Services</h2><p id="5da9" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">To verify the correctness of deploying services with Bazel, we used startup and integration tests. Of ~700 services, ~100 encountered startup or integration test failures. The majority of failures were missing dependencies that were loaded via Java reflection, usually from config files or other files. As a result, we were able to fix a number of these issues by parsing files for classes that would be loaded via reflection, and then adding the required dependency.</p><p id="a5a8" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Another major source of errors was differing library versions, which could lead to missing symbol errors at runtime. In Gradle, users manually specified dependencies and their versions. However, in Bazel, build files were generated from source and dependencies were inferred from import statements, which didn’t specify the version. 
We solved many of these errors by forcing third-party library versions to match those of the Gradle project.</p><p id="e1d7" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">After accounting for reflected classes and syncing versions, only a single-digit percentage of services hit production runtime issues that required more in-depth manual work to fix.</p><h2 id="5888" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Data pipelines and other projects</h2><p id="3902" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">In addition to services, we had 450 data pipelines and ~50 other projects that were deployed to either a Spark cluster or a Flink runtime.</p><p id="1dd6" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">As with services, we were able to catch a number of issues using tests. In particular, data pipelines on Airbnb’s data engineering paved path have CI tests that run a small version of the Spark pipeline locally. For these ~400 paved-path pipelines, after passing CI tests, only about 3% had production issues at runtime. 
As a result, we were able to very quickly migrate the paved-path pipelines.</p><p id="4e9e" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">As with services, we had a few remaining deployables that were a bit more bespoke and had to be individually deployed and monitored to verify correctness.</p><h1 id="bdf2" class="pr ps io bf pt pu pv pw gl px py pz gn qa qb qc qd qe qf qg qh qi qj qk ql qm bk">What did we learn?</h1><h2 id="6067" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Customer partnership</h2><p id="8a96" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">Early in the migration, we identified key pilot services that had large opportunities for impact and an appetite to invest in migrating to a new build system. For example, Viaduct had complex build logic leading to slow builds and reliability issues. In addition, many developers contributed to the service, so improving their builds had a large impact on Airbnb’s developer experience.</p><p id="9a63" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Partnering with pilot teams was incredibly valuable. They were early adopters and made significant contributions in the form of reporting and debugging issues, profiling performance bottlenecks, and suggesting features. The pilot teams also became advocates and provided internal support, helping motivate the rest of the Airbnb developer community.</p><h2 id="7fe6" class="qs ps io bf pt gk qt dy gl gm qu ea gn go qv gp gq gr qw gs gt gu qx gv gw qy bk">Dangers of premature optimization</h2><p id="a4ab" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">This migration took 4.5 years. 
With the benefit of hindsight, we think we could have drastically shortened the migration timeline if we had <strong class="oq ip">migrated first, before improving</strong><em class="qz">.</em></p><p id="6e38" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Although increasing build granularity improved build times, it also lengthened the migration. Specifically, finer build granularity greatly increased the number of configuration files, making it much harder to manage configurations manually. This forced more functionality into automated build file generation, which increased its complexity.</p><p id="2ea1" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">If we had migrated first and <em class="qz">then</em> optimized the build granularity, we believe we could have migrated sooner, enabling users to get benefits from Bazel sooner and reducing the time spent maintaining two co-existing builds.</p><p id="be6f" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Build granularity also made it harder to match deploy `.jar`s between Gradle and Bazel. This led to spending more time testing deployments and fixing runtime issues.</p><p id="bab4" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">On a more positive note, we accelerated the migration by deciding to support multiple third-party library versions and implementing version resolution. 
This enabled us to sync versions from Gradle to Bazel, which fixed a large number of build and runtime issues.</p><p id="eaf0" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">One major takeaway, which became clear towards the end of the migration, was that <strong class="oq ip">by default, we should try to imitate what was there before</strong><em class="qz">. </em>In our case, deviating from Gradle usually added technical risk, so any deviation, and especially its downstream consequences, should be carefully considered.</p><p id="dd50" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">As engineers, we often want to improve things. However, during a migration, improvements can have non-obvious consequences and can significantly slow the migration down.</p><h1 id="1aaa" class="pr ps io bf pt pu pv pw gl px py pz gn qa qb qc qd qe qf qg qh qi qj qk ql qm bk">🏁 Conclusion</h1><p id="8c8f" class="pw-post-body-paragraph oo op io oq b or qn ot ou ov qo ox oy go qp pa pb gr qq pd pe gu qr pg ph pi hq bk">After 4.5 years, we fully migrated Airbnb’s biggest repo from Gradle to Bazel, achieving:</p><ul class=""><li id="0c7d" class="oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi pj pk pl bk">Build CSAT: 38% → 68%</li><li id="d9ac" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk"><strong class="oq ip">3–5x </strong>faster local build and test times</li><li id="0598" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk"><strong class="oq ip">2–3x</strong> faster IntelliJ syncs</li><li id="2426" class="oo op io oq b or pm ot ou ov pn ox oy go po pa pb gr pp pd pe gu pq pg ph pi pj pk pl bk"><strong class="oq ip">2–3x</strong> faster deploys to the development environment</li></ul><p id="71f3" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz 
pa pb gr pc pd pe gu pf pg ph pi hq bk">Finally, now that several Airbnb repos are on Bazel, we’re able to share <a class="ag hb" href="https://www.youtube.com/watch?v=RpSVBtyoYCY" rel="noopener ugc nofollow" target="_blank">common infrastructure</a> such as remote build caching, remote build execution, affected targets calculation, and more.</p><p id="4215" class="pw-post-body-paragraph oo op io oq b or os ot ou ov ow ox oy go oz pa pb gr pc pd pe gu pf pg ph pi hq bk">Interested in helping us solve problems like these at Airbnb? Learn more about our open engineering roles <a class="ag hb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/migrating-airbnbs-jvm-monorepo-to-bazel-33f90eda51ec</link>
      <guid>https://medium.com/airbnb-engineering/migrating-airbnbs-jvm-monorepo-to-bazel-33f90eda51ec</guid>
      <pubDate>Wed, 13 Aug 2025 19:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Seamless Istio Upgrades at Scale]]></title>
      <description><![CDATA[<div><div><h2 id="0d9c" class="pw-subtitle-paragraph jl in io bf b jm jn jo jp jq jr js jt ju jv jw jx jy jz ka cq du"><strong class="am">How Airbnb upgrades tens of thousands of pods on dozens of Kubernetes clusters to new Istio versions</strong></h2><div></div><figure class="ok ol om on oo op oh oi paragraph-image"><div role="button" tabindex="0" class="oq or fl os bh ot"><div class="oh oi oj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*44AVFDg8R66nWj4cAU8vCA.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*44AVFDg8R66nWj4cAU8vCA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*44AVFDg8R66nWj4cAU8vCA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*44AVFDg8R66nWj4cAU8vCA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*44AVFDg8R66nWj4cAU8vCA.jpeg 786w, 
https://miro.medium.com/v2/resize:fit:828/1*44AVFDg8R66nWj4cAU8vCA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*44AVFDg8R66nWj4cAU8vCA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*44AVFDg8R66nWj4cAU8vCA.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="d7f1" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Airbnb has been running Istio® at scale since 2019. We support workloads running on both Kubernetes and virtual machines (using <a class="ag hb" href="https://istio.io/latest/docs/ops/deployment/vm-architecture/" rel="noopener ugc nofollow" target="_blank">Istio’s mesh expansion</a>). Across these two environments, we run tens of thousands of pods, dozens of Kubernetes clusters, and thousands of VMs. These workloads send tens of millions of QPS at peak through Istio. Our <a class="ag hb" href="https://www.youtube.com/watch?v=6kDiDQW5YXQ" rel="noopener ugc nofollow" target="_blank">IstioCon 2021 talk</a> describes our journey onto Istio and our <a class="ag hb" href="https://www.youtube.com/watch?v=1D8lg36ZNHs" rel="noopener ugc nofollow" target="_blank">KubeCon 2021 talk</a> goes into further detail on our architecture.</p><p id="694d" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Istio is a foundational piece of our architecture, which makes ongoing maintenance and upgrades a challenge. 
Despite that, we have upgraded Istio a total of 14 times. This blog post will explore how the Service Mesh team at Airbnb safely upgrades Istio while maintaining high availability.</p><h2 id="a45a" class="pt pu io bf pv gk pw dy gl gm px ea gn go py gp gq gr pz gs gt gu qa gv gw qb bk">Challenges</h2><p id="989c" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">Airbnb engineers collectively run thousands of different workloads. We cannot reasonably coordinate the teams that own these, so our upgrades must function independently of individual teams. We also cannot monitor all of these at once, and so we must minimize risk through gradual rollouts.</p><p id="4070" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">With that in mind, we designed our upgrade process with the following goals:</p><ol class=""><li id="f3a2" class="pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps qh qi qj bk">Zero downtime for workloads and users. 
This is the <em class="qk">seamless</em> part of the upgrade — a workload owner doesn’t need to be in the loop for Istio upgrades.</li><li id="e8c5" class="pa pb io pc b jm ql pe pf jp qm ph pi go qn pk pl gr qo pn po gu qp pq pr ps qh qi qj bk">Gradual rollouts with the ability to control which workloads are upgraded or reverted.</li><li id="1141" class="pa pb io pc b jm ql pe pf jp qm ph pi go qn pk pl gr qo pn po gu qp pq pr ps qh qi qj bk">We must be able to roll back an upgrade across all workloads, without coordinating every workload team.</li><li id="110d" class="pa pb io pc b jm ql pe pf jp qm ph pi go qn pk pl gr qo pn po gu qp pq pr ps qh qi qj bk">All workloads should be upgraded within some defined time.</li></ol><h2 id="ff8b" class="pt pu io bf pv gk pw dy gl gm px ea gn go py gp gq gr pz gs gt gu qa gv gw qb bk">Architecture</h2><p id="458a" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">Our deployment consists of one management cluster, which runs Istiod and contains all workload configuration for the mesh (VirtualServices, DestinationRules, and so forth), and multiple workload clusters, which run user workloads. VMs run separately, but their Istio manifests are still deployed to the management cluster in their own namespaces. 
We use Sidecar mode exclusively, meaning that every workload runs <code class="cx qq qr qs qt b">istio-proxy</code> — we do not yet run <a class="ag hb" href="https://istio.io/latest/docs/ambient/overview/" rel="noopener ugc nofollow" target="_blank">Ambient</a>.</p><figure class="ok ol om on oo op oh oi paragraph-image"><div role="button" tabindex="0" class="oq or fl os bh ot"><div class="oh oi qu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*-qAuZZZxdsi-oJSO 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*-qAuZZZxdsi-oJSO 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*-qAuZZZxdsi-oJSO 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*-qAuZZZxdsi-oJSO 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*-qAuZZZxdsi-oJSO 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*-qAuZZZxdsi-oJSO 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*-qAuZZZxdsi-oJSO 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*-qAuZZZxdsi-oJSO 640w, https://miro.medium.com/v2/resize:fit:720/0*-qAuZZZxdsi-oJSO 720w, https://miro.medium.com/v2/resize:fit:750/0*-qAuZZZxdsi-oJSO 750w, https://miro.medium.com/v2/resize:fit:786/0*-qAuZZZxdsi-oJSO 786w, https://miro.medium.com/v2/resize:fit:828/0*-qAuZZZxdsi-oJSO 828w, https://miro.medium.com/v2/resize:fit:1100/0*-qAuZZZxdsi-oJSO 1100w, 
https://miro.medium.com/v2/resize:fit:1400/0*-qAuZZZxdsi-oJSO 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h2 id="02d7" class="pt pu io bf pv gk pw dy gl gm px ea gn go py gp gq gr pz gs gt gu qa gv gw qb bk">Upgrade Process</h2><p id="a897" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">At a high level, we follow <a class="ag hb" href="https://istio.io/latest/docs/setup/upgrade/canary/" rel="noopener ugc nofollow" target="_blank">Istio’s canary upgrade model</a>. This involves running two versions (or Istio revisions) of Istiod simultaneously: the current version and the new version that we are upgrading to. Both form one logical service mesh, so workloads connected to one Istiod can communicate with workloads connected to the other. Istiod versions are managed using different revision labels; for example, <code class="cx qq qr qs qt b">1-24-5</code> for Istio 1.24.5 and <code class="cx qq qr qs qt b">1-25-2</code> for Istio 1.25.2.</p><p id="4353" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">An upgrade involves both Istiod, the control plane, and <code class="cx qq qr qs qt b">istio-proxy</code>, the data plane sidecar, running on all pods and VMs. 
While Istio supports connecting an <a class="ag hb" href="https://istio.io/latest/docs/releases/supported-releases/#control-planedata-plane-skew" rel="noopener ugc nofollow" target="_blank">older istio-proxy to a newer Istiod</a>, we do not use this. Instead, we atomically roll out the new <code class="cx qq qr qs qt b">istio-proxy</code> version to a workload along with the configuration of which Istiod to connect to. For example, the <code class="cx qq qr qs qt b">istio-proxy</code> built for version 1.24 will only connect to 1.24’s Istiod and the <code class="cx qq qr qs qt b">istio-proxy</code> built for 1.25 will only connect to 1.25’s Istiod. This reduces a dimension of complexity during upgrades (cross-version data plane — control plane compatibility).</p><p id="803c" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">The first step of our upgrade process is to deploy the new Istiod, with a new revision label, onto the management cluster. 
Because all workloads are explicitly pinned to a revision, no workload will connect to this new Istiod, so this first step has no impact.</p><p id="02f1" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">The rest of the upgrade comprises all of the effort and risk — workloads are gradually shifted to run the new <code class="cx qq qr qs qt b">istio-proxy</code> version and connect to the new Istiod.</p><figure class="ok ol om on oo op oh oi paragraph-image"><div role="button" tabindex="0" class="oq or fl os bh ot"><div class="oh oi qu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*k7_eTtEPqBiDKM3t 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*k7_eTtEPqBiDKM3t 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*k7_eTtEPqBiDKM3t 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*k7_eTtEPqBiDKM3t 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*k7_eTtEPqBiDKM3t 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*k7_eTtEPqBiDKM3t 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*k7_eTtEPqBiDKM3t 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*k7_eTtEPqBiDKM3t 640w, https://miro.medium.com/v2/resize:fit:720/0*k7_eTtEPqBiDKM3t 720w, https://miro.medium.com/v2/resize:fit:750/0*k7_eTtEPqBiDKM3t 750w, 
https://miro.medium.com/v2/resize:fit:786/0*k7_eTtEPqBiDKM3t 786w, https://miro.medium.com/v2/resize:fit:828/0*k7_eTtEPqBiDKM3t 828w, https://miro.medium.com/v2/resize:fit:1100/0*k7_eTtEPqBiDKM3t 1100w, https://miro.medium.com/v2/resize:fit:1400/0*k7_eTtEPqBiDKM3t 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qv ff qw oh oi qx qy bf b bg ab du"><em class="qz">Multiple Istio revisions, with some workloads connected to different revisions.</em></figcaption></figure><h1 id="2fcc" class="ra pu io bf pv rb rc jo gl rd re jr gn rf rg rh ri rj rk rl rm rn ro rp rq rr bk">Rollout specification</h1><p id="6380" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">A file called <code class="cx qq qr qs qt b">rollouts.yml</code> controls which version of <code class="cx qq qr qs qt b">istio-proxy</code> each workload runs. This file specifies workload namespaces (as patterns) and the percentage distribution of Istio versions:</p><pre class="ok ol om on oo rs qt rt bp ru bb bk"># "production" is the default; anything not matching a different pattern will match this.<br />production:<br />  1-24-5: 100<br /><br />".*-staging":<br />  1-24-5: 75<br />  1-25-2: 25<br /><br /># A pinned namespace; our end-to-end verification workload.<br />istio-e2e:<br />  1-25-2: 100</pre><p id="664f" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">This spec dictates the desired state of all namespaces. 
A given namespace is first mapped to a bucket (based on the longest pattern that matches) and then a version is chosen based on the distribution for that bucket. The distribution applies at the namespace level, not the pod (or VM) level. For example,</p><pre class="ok ol om on oo rs qt rt bp ru bb bk">".*-staging":<br />  1-24-5: 75<br />  1-25-2: 25</pre><p id="a0ea" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">means that 75% of the namespaces with the suffix <code class="cx qq qr qs qt b">-staging</code> will be assigned to <code class="cx qq qr qs qt b">1-24-5</code> and the remaining 25% will be assigned to <code class="cx qq qr qs qt b">1-25-2</code>. This assignment is deterministic, using consistent hashing. The majority of our upgrade process involves updating <code class="cx qq qr qs qt b">rollouts.yml</code> and then monitoring.</p><p id="215a" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">This process allows us to selectively upgrade workloads. We can also upgrade environments separately and ensure that only a certain percentage of those environments are on the new version. 
This gives us time to bake an upgrade and learn of potential regressions.</p><p id="9f9e" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">The rest of this post will describe the mechanism through which a change to <code class="cx qq qr qs qt b">rollouts.yml</code> is applied to thousands of workloads, for both Kubernetes and VMs.</p><h1 id="67d5" class="ra pu io bf pv rb rc jo gl rd re jr gn rf rg rh ri rj rk rl rm rn ro rp rq rr bk">Kubernetes</h1><p id="3164" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">Each Istio revision has a corresponding <a class="ag hb" href="https://istio.io/latest/docs/setup/additional-setup/sidecar-injection/#automatic-sidecar-injection" rel="noopener ugc nofollow" target="_blank">MutatingAdmissionWebhook for sidecar injection</a> on every workload cluster. This webhook selects pods specifying the label <code class="cx qq qr qs qt b">istio.io/rev=&lt;revision&gt;</code> and injects the <code class="cx qq qr qs qt b">istio-proxy</code> and <code class="cx qq qr qs qt b">istio-init</code> containers into those pods. Notably, the <code class="cx qq qr qs qt b">istio-proxy</code> container contains the <code class="cx qq qr qs qt b">PROXY_CONFIG</code> environment variable, which sets the <code class="cx qq qr qs qt b">discoveryAddress</code> to the Istiod revision. This is how the <code class="cx qq qr qs qt b">istio-proxy</code> version and the configuration for which Istiod to connect to are deployed atomically, entirely by the sidecar injector.</p><p id="eee6" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Every workload’s Deployment has this revision label. 
For example, a workload configured to use Istio 1.24.5 will have the label <code class="cx qq qr qs qt b">istio.io/rev=1-24-5</code> in its pod template; thus pods for that Deployment will be mutated by the MutatingAdmissionWebhook for Istio 1.24.5.</p><p id="3ac7" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">This setup is the standard method of upgrading Istio, but requires that every Deployment specifies a revision label. To perform an upgrade across thousands of workloads, every team would have to update this label and deploy their workload. We could neither perform a rollback across all workloads nor reasonably expect an upgrade to reach 100%, both for the same reason: each would rely on every workload being deployed.</p><h2 id="5c79" class="pt pu io bf pv gk pw dy gl gm px ea gn go py gp gq gr pz gs gt gu qa gv gw qb bk">Krispr</h2><p id="f047" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">To avoid having to update workloads individually, a workload’s configuration never directly specifies the revision label in source code. Instead, we use <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/a-krispr-approach-to-kubernetes-infrastructure-a0741cff4e0c" data-discover="true">Krispr, a mutation framework built in-house</a>, to inject the revision label. Krispr gives us the ability to decouple infrastructure component upgrades from workload deployments.</p><p id="88db" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Airbnb workloads that run on Kubernetes use an internal API to define their workload, instead of specifying Kubernetes manifests. This abstraction is then compiled into Kubernetes manifests during CI. Krispr runs as part of this compilation and mutates those Kubernetes manifests. 
One of those mutations injects the Istio revision label into the pod specification of each Deployment, reading <code class="cx qq qr qs qt b">rollouts.yml</code> to decide which label to inject. If a team sees any issue with their workload when they deploy, they can roll back and thus also roll back the Istio upgrade, all without involving the Service Mesh team.</p><p id="4c77" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">In addition, Krispr runs during pod admission. If a pod is being admitted from a Deployment that is more than two weeks old, Krispr will re-mutate the pod and update the pod’s revision label if needed. Because our Kubernetes nodes have a maximum lifetime of two weeks, which ensures that any given pod’s maximum lifetime is also two weeks, we can guarantee that an Istio upgrade completes. A majority of workloads will be upgraded when they deploy (during the Krispr run in CI), and for those that don’t deploy regularly, the natural pod cycling and re-mutation will ensure they are upgraded in at most four weeks.</p><p id="2247" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">In summary, per workload:</p><ol class=""><li id="2519" class="pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps qh qi qj bk">During CI, Krispr mutates the Kubernetes manifests of a workload to add the Istio revision label, based on <code class="cx qq qr qs qt b">rollouts.yml</code>.</li><li id="c6bf" class="pa pb io pc b jm ql pe pf jp qm ph pi go qn pk pl gr qo pn po gu qp pq pr ps qh qi qj bk">When a pod is admitted to a cluster, Krispr will re-mutate the pod if its Deployment is more than two weeks old and update the Istio revision label if needed.</li><li id="3821" class="pa pb io pc b jm ql pe pf jp qm ph pi go qn pk pl gr qo pn po gu qp pq pr ps qh qi qj bk">The revision-specific Istio 
MutatingAdmissionWebhook will mutate the pod by injecting the sidecar and associated <code class="cx qq qr qs qt b">discoveryAddress</code>.</li></ol><h1 id="94d4" class="ra pu io bf pv rb rc jo gl rd re jr gn rf rg rh ri rj rk rl rm rn ro rp rq rr bk">Virtual machines</h1><p id="ac8f" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">On VMs, we deploy an artifact that contains <code class="cx qq qr qs qt b">istio-proxy</code>, a script to run <code class="cx qq qr qs qt b">istio-iptables</code> (similar to the <code class="cx qq qr qs qt b">istio-init</code> container), and the Istiod <code class="cx qq qr qs qt b">discoveryAddress</code>. By packaging <code class="cx qq qr qs qt b">istio-proxy</code> and the <code class="cx qq qr qs qt b">discoveryAddress</code> in the same artifact, we can atomically upgrade both.</p><p id="6aac" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Installation of this artifact is the responsibility of an on-host daemon called <code class="cx qq qr qs qt b">mxagent</code>. It determines what version to install by polling a set of key-value tags on the VM (such as <a class="ag hb" href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html" rel="noopener ugc nofollow" target="_blank">EC2 tags on AWS</a> or <a class="ag hb" href="https://cloud.google.com/compute/docs/tag-resources" rel="noopener ugc nofollow" target="_blank">resource tags on GCP</a>). These tags mimic the <code class="cx qq qr qs qt b">istio.io/rev</code> label for Kubernetes-based workloads. Whenever they change, <code class="cx qq qr qs qt b">mxagent</code> will download and install the artifact corresponding to that version. 
Thus, upgrading <code class="cx qq qr qs qt b">istio-proxy</code> on a VM just involves updating these tags on that VM; <code class="cx qq qr qs qt b">mxagent</code> will take care of the rest.</p><p id="b7a3" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Our VM workloads are largely infrastructure platforms that don’t typically have code deployed at regular intervals. As such, VMs don’t support a deploy-time upgrade (in the way that Kubernetes workloads can be upgraded when they deploy). Similarly, teams cannot roll back these workloads themselves, but this has been acceptable, given that there are just a handful of such infrastructure platforms.</p><p id="7a7a" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">The tag updates are managed by a central controller, <code class="cx qq qr qs qt b">mxrc</code>, which scans for outdated VMs. If <code class="cx qq qr qs qt b">rollouts.yml</code> would result in a different set of resource tags for a VM, the controller will update the tags accordingly. This roughly corresponds to Krispr’s pod admission-time mutation — however, with the caveat that VMs are mutable and long-lived, and thus are upgraded in-place.</p><p id="9b45" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">For safety, <code class="cx qq qr qs qt b">mxrc</code> takes into account the health of the VM, namely in the form of the <a class="ag hb" href="https://istio.io/latest/docs/reference/config/networking/workload-group/#ReadinessProbe" rel="noopener ugc nofollow" target="_blank">readiness probe status on the WorkloadEntry</a>. 
Similar to Kubernetes’ <code class="cx qq qr qs qt b">maxUnavailable</code> semantics, <code class="cx qq qr qs qt b">mxrc</code> aims to keep the number of unavailable VMs (that is, unhealthy VMs plus those with in-progress upgrades) below a defined percentage. It gradually performs these upgrades, aiming to upgrade all the VMs for a workload in two weeks.</p><p id="0a04" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">At the end of two weeks, all VMs will match the desired state in <code class="cx qq qr qs qt b">rollouts.yml</code>.</p><h1 id="c721" class="ra pu io bf pv rb rc jo gl rd re jr gn rf rg rh ri rj rk rl rm rn ro rp rq rr bk">Conclusion</h1><p id="041a" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">Keeping up to date with open-source software is a challenge, especially at scale. Upgrades and other Day-2 operations often become an afterthought, which adds to the burden when upgrades are eventually necessary (to bring in security patches, remain within support windows, utilize new features, and so forth). This is particularly true with Istio, where a version reaches end-of-life support rapidly.</p><p id="aa32" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">Even with the complexity and scale of our service mesh, we have successfully upgraded Istio 14 times. This was made possible by designing for maintainability, building a process that ensures zero downtime, and derisking through the use of gradual rollouts. 
Similar processes are in use for a number of other foundational infrastructure systems at Airbnb.</p><h1 id="538c" class="ra pu io bf pv rb rc jo gl rd re jr gn rf rg rh ri rj rk rl rm rn ro rp rq rr bk">Future work</h1><p id="25da" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">As Airbnb’s infrastructure continues to evolve and grow, we’re looking at a few key projects to evolve our service mesh:</p><ul class=""><li id="5b03" class="pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps sa qi qj bk">Utilizing <a class="ag hb" href="https://istio.io/latest/docs/ambient/overview/" rel="noopener ugc nofollow" target="_blank">Ambient mode</a> as a more cost-effective and easier-to-manage deployment model of Istio. In particular, this simplifies upgrades by not needing to touch workload deployments at all.</li><li id="a92f" class="pa pb io pc b jm ql pe pf jp qm ph pi go qn pk pl gr qo pn po gu qp pq pr ps sa qi qj bk">Splitting our singular production mesh into multiple meshes in order to separate fault domains, provide better security isolation boundaries, and scale Istio further. 
For upgrades, this would further reduce the blast radius, as some meshes that only run low-risk workloads (such as staging) could be upgraded first.</li></ul><p id="a79e" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk">If this type of work interests you, we encourage you to apply for an <a class="ag hb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open position</a> today.</p><h1 id="d246" class="ra pu io bf pv rb rc jo gl rd re jr gn rf rg rh ri rj rk rl rm rn ro rp rq rr bk">Acknowledgements</h1><p id="4e08" class="pw-post-body-paragraph pa pb io pc b jm qc pe pf jp qd ph pi go qe pk pl gr qf pn po gu qg pq pr ps hq bk">All of our work with Istio is thanks to many different people, including: Jungho Ahn, Stephen Chan, Weibo He, Douglas Jordan, Brian Wolfe, Edie Yang, Dasol Yoon, and Ying Zhu.</p><p id="9003" class="pw-post-body-paragraph pa pb io pc b jm pd pe pf jp pg ph pi go pj pk pl gr pm pn po gu pp pq pr ps hq bk"><em class="qk">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/seamless-istio-upgrades-at-scale-bcb0e49c5cf8</link>
      <guid>https://medium.com/airbnb-engineering/seamless-istio-upgrades-at-scale-bcb0e49c5cf8</guid>
      <pubDate>Thu, 07 Aug 2025 19:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Achieving High Availability with distributed database on Kubernetes at Airbnb]]></title>
      <description><![CDATA[<div><div></div><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw nx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*X2h18kiTtfcbRXQo 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*X2h18kiTtfcbRXQo 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*X2h18kiTtfcbRXQo 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*X2h18kiTtfcbRXQo 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*X2h18kiTtfcbRXQo 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*X2h18kiTtfcbRXQo 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*X2h18kiTtfcbRXQo 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*X2h18kiTtfcbRXQo 640w, https://miro.medium.com/v2/resize:fit:720/0*X2h18kiTtfcbRXQo 720w, https://miro.medium.com/v2/resize:fit:750/0*X2h18kiTtfcbRXQo 750w, https://miro.medium.com/v2/resize:fit:786/0*X2h18kiTtfcbRXQo 786w, https://miro.medium.com/v2/resize:fit:828/0*X2h18kiTtfcbRXQo 828w, https://miro.medium.com/v2/resize:fit:1100/0*X2h18kiTtfcbRXQo 1100w, https://miro.medium.com/v2/resize:fit:1400/0*X2h18kiTtfcbRXQo 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="1068" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Introduction</h1><p id="16be" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">Traditionally, organizations have deployed databases on costly, high-end standalone servers, using sharding as the scaling strategy. As data demands grew, the limitations of this strategy became evident in ever longer and more complex maintenance projects.</p><p id="5c7c" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">Distributed, horizontally scalable databases are increasingly common, and many of them are open source. However, running these databases reliably in the cloud with high availability, low latency, and scalability, all at a reasonable cost, is a problem many companies are trying to solve.</p><p id="0c08" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">We chose an innovative strategy of deploying <strong class="pm ip">a distributed database cluster across multiple Kubernetes clusters in a cloud environment</strong>. 
Although currently an uncommon design pattern due to its complexity, this strategy allowed us to achieve our target system reliability and operability.</p><p id="db41" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">In this post, we’ll share how we overcame the challenges and the best practices we’ve developed for this strategy. We believe these practices apply to any other strongly consistent, distributed storage system.</p><h1 id="2b3b" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Managing Databases on Kubernetes</h1><p id="0898" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">Earlier this year, we integrated an open source, horizontally scalable, distributed SQL database into our infrastructure.</p><p id="a5a9" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">While Kubernetes is a great tool for running stateless services, the use of Kubernetes for stateful services (like databases) is challenging, particularly around node replacement and upgrades.</p><p id="3478" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">Since Kubernetes lacks knowledge of data distribution across nodes, each node replacement requires careful data handling to prevent data quorum loss and service disruption; this includes copying the data before replacing a node.</p><p id="dc0d" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">At Airbnb, we opted to attach storage volumes to nodes using AWS EBS, which allows quick volume reattachment to new virtual machines upon node replacement. 
Thanks to Kubernetes’ Persistent Volume Claims (<a class="ag hb" href="https://kubernetes.io/docs/concepts/storage/persistent-volumes/#binding" rel="noopener ugc nofollow" target="_blank">PVC</a>), this reattachment happens automatically. In addition, we need to allow time for a new storage node to catch up with the cluster’s current state before moving to the next node replacement. For this, we rely on a custom <a class="ag hb" href="https://kubernetes.io/docs/concepts/extend-kubernetes/operator/" rel="noopener ugc nofollow" target="_blank">k8s operator</a><a class="ag hb" href="https://github.com/pingcap/tidb-operator" rel="noopener ugc nofollow" target="_blank">,</a> which allows us to customize various Kubernetes operations according to the specifics of the application.</p><h1 id="1271" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Coordinating Node Replacement</h1><p id="fbca" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">Node replacements occur for various reasons, from <a class="ag hb" href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-retirement.html" rel="noopener ugc nofollow" target="_blank">AWS instance retirement</a> to Kubernetes upgrades or configuration changes. 
To address these cases, we categorize node replacement events into three groups:</p><ol class=""><li id="b4fa" class="pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe qk ql qm bk"><strong class="pm ip">Database-initiated events:</strong> Such as config changes or version upgrades.</li><li id="3bc8" class="pk pl io pm b pn qn pp pq pr qo pt pu go qp pw px gr qq pz qa gu qr qc qd qe qk ql qm bk"><strong class="pm ip">Proactive infrastructure events:</strong> Like instance retirements or node upgrades.</li><li id="5375" class="pk pl io pm b pn qn pp pq pr qo pt pu go qp pw px gr qq pz qa gu qr qc qd qe qk ql qm bk"><strong class="pm ip">Unplanned infrastructure failures:</strong> Such as a node becoming unresponsive.</li></ol><p id="135c" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">To safely manage node replacements for database-initiated events, we implemented a custom check in the k8s-operator that verifies that all nodes are up and running before deleting any pod.</p><p id="1706" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">To serialize these replacements with the second group, which is initiated by infrastructure, we implemented <a class="ag hb" href="https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/" rel="noopener ugc nofollow" target="_blank">an admission hook</a> in k8s to intercept pod eviction. 
This admission hook rejects any attempt to evict the pod but sets a custom annotation on it, which our custom database k8s-operator watches and acts on to safely delete the pod, serializing the deletion with any database-initiated node replacements described above.</p><p id="a8e4" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">Node replacements due to unplanned infrastructure failures, such as hardware failure, can’t be coordinated. However, we can still improve availability by ensuring that any node replacement event from the first two groups is blocked until the failed hardware is replaced.</p><p id="ef90" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">In our infrastructure, the k8s operator handles both proactive and infrastructure-triggered node replacements, maintaining data consistency in the presence of node replacements and ensuring that unplanned events don’t impact ongoing maintenance.</p><h1 id="b959" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Kubernetes Upgrades</h1><p id="9023" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">Regular Kubernetes upgrades are essential but can be high-risk operations, especially for databases. Cloud-managed Kubernetes might not offer rollbacks once the control plane is upgraded, posing a potential disaster recovery challenge if something goes wrong. 
Although we use self-managed Kubernetes clusters, which do allow rolling back the control plane, a bad Kubernetes upgrade could still cause service disruption until the rollback is completed.</p><h1 id="5a24" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Ensuring Fault Tolerance with Multiple Kubernetes Clusters</h1><p id="85c6" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">At Airbnb, we think the best way to achieve high regional availability is to deploy each database across three independent Kubernetes clusters, each within a different AWS availability zone (<a class="ag hb" href="https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/availability-zones.htm" rel="noopener ugc nofollow" target="_blank">AZ</a>). AWS uses availability zones not just to provide independent power, networking, and connectivity, but also to roll out changes zone by zone. Aligning our Kubernetes clusters with AWS AZs also means that any underlying infrastructure issues or bad deployments have a limited blast radius, as they are restricted to a single AZ. 
Internally, we also deploy a new configuration or a new database version to a part of the logical cluster running in a single Kubernetes cluster in one AZ first.</p><p id="7fd7" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">While this setup adds complexity, it significantly boosts availability by limiting the blast radius of any issues stemming from faulty deployments at every layer, whether database, Kubernetes, or AWS infrastructure.</p><p id="8d22" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">For instance, recently, a faulty config deployment in our infrastructure abruptly terminated all VMs of a specific type in our staging Kubernetes cluster, deleting most of the query layer pods. However, since the disruption was isolated to a single Kubernetes cluster, two-thirds of our query layer nodes remained operational, preventing any impact.</p><p id="dc1c" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">We also overprovision our database clusters to ensure that, even if an entire AZ, Kubernetes cluster, or all storage nodes within a zone go down, we still have sufficient capacity to handle traffic.</p><figure class="qt qu qv qw qx od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fl og bh oh"><div class="nv nw qs"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*1hSoKkktmPABPkTj 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*1hSoKkktmPABPkTj 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*1hSoKkktmPABPkTj 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*1hSoKkktmPABPkTj 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*1hSoKkktmPABPkTj 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*1hSoKkktmPABPkTj 1100w, 
https://miro.medium.com/v2/resize:fit:1400/format:webp/0*1hSoKkktmPABPkTj 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*1hSoKkktmPABPkTj 640w, https://miro.medium.com/v2/resize:fit:720/0*1hSoKkktmPABPkTj 720w, https://miro.medium.com/v2/resize:fit:750/0*1hSoKkktmPABPkTj 750w, https://miro.medium.com/v2/resize:fit:786/0*1hSoKkktmPABPkTj 786w, https://miro.medium.com/v2/resize:fit:828/0*1hSoKkktmPABPkTj 828w, https://miro.medium.com/v2/resize:fit:1100/0*1hSoKkktmPABPkTj 1100w, https://miro.medium.com/v2/resize:fit:1400/0*1hSoKkktmPABPkTj 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="6625" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Leveraging AWS EBS for Reliability and Latency Handling</h1><p id="8f05" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">EBS offers two key benefits for our deployment: rapid reattachment during node 
replacements and superior durability compared to local disks. With EBS, we confidently run a highly available cluster using only three replicas, maintaining reliability without needing additional redundancy.</p><p id="cead" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">However, EBS can occasionally experience tail latency spikes, with p99 latency reaching up to 1 second. To mitigate this, we implemented a storage read timeout session variable, allowing queries to transparently retry against other storage nodes during EBS latency spikes. By default, the database we use sends all requests and retries to the leader. To enable retries on storage nodes with healthy EBS, we have to allow reads from both the leader and the replicas, but prefer the closest one for the original request. This brings the added benefit of reduced latency and no cross-AZ network costs, as we have a replica in each AZ. Finally, for use cases that permit it, we leverage the stale reads feature, enabling reads to be served independently by the replica without requiring synchronous calls to the leader, which may be experiencing an EBS latency spike at the time of the read.</p><h1 id="af63" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Conclusion: Exploring Open Source Databases on Kubernetes</h1><p id="1843" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">Our journey running a distributed database on Kubernetes has empowered us to achieve high availability, low latency, scalability, and lower maintenance costs. 
By leveraging the operator pattern, multi-cluster deployments, AWS EBS, and stale reads, we’ve demonstrated that even open source distributed storage systems can thrive in cloud environments.</p><p id="9926" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">We already operate several database clusters in production in the described setup, with the largest one handling 3M QPS across 150 storage nodes and storing over 300 TB of data spread across 4M internal shards. All this comes with 99.95% availability, thanks to the techniques described in this post.</p><p id="59fa" class="pw-post-body-paragraph pk pl io pm b pn qf pp pq pr qg pt pu go qh pw px gr qi pz qa gu qj qc qd qe hq bk">For other companies considering running open-source databases on Kubernetes, the opportunities are immense. Embrace the challenge: run open-source databases and help shape these tools for enterprise use. The future of scalable, reliable data management in the cloud lies in collaboration and open-source innovation. Now is the time to lead and participate.</p><h1 id="fea3" class="oo op io bf oq or os ot gl ou ov ow gn ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Acknowledgments</h1><p id="7fdd" class="pw-post-body-paragraph pk pl io pm b pn po pp pq pr ps pt pu go pv pw px gr py pz qa gu qb qc qd qe hq bk">Thanks to Abhishek Parmar, Brian Wolfe, Chen Ding, Daniel Low, Hao Luo, and Xiaomou Wang for their collaboration, and to Shylaja Ramachandra for editing.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/achieving-high-availability-with-distributed-database-on-kubernetes-at-airbnb-58cc2e9856f4</link>
      <guid>https://medium.com/airbnb-engineering/achieving-high-availability-with-distributed-database-on-kubernetes-at-airbnb-58cc2e9856f4</guid>
      <pubDate>Mon, 28 Jul 2025 19:57:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Understanding and Improving SwiftUI Performance]]></title>
      <description><![CDATA[<div><div></div><p id="7d50" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">New techniques we’re using at Airbnb to improve and maintain performance of SwiftUI features at scale</p><p id="8049" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">By <a class="ag hb" href="https://www.linkedin.com/in/calstephens/" rel="noopener ugc nofollow" target="_blank">Cal Stephens</a>, <a class="ag hb" href="https://www.linkedin.com/in/miguel-jimenez-b98216112" rel="noopener ugc nofollow" target="_blank">Miguel Jimenez</a></p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox oy"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source 
data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*qBYJ9abMpZyuODmbHkYD5Q.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="f5d5" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Airbnb <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/unlocking-swiftui-at-airbnb-ea58f50cde49" data-discover="true">first adopted SwiftUI in 2022</a>, starting with individual components and later expanding to entire screens and features. We’ve seen major improvements to engineers’ productivity thanks to its declarative, flexible, and composable architecture. However, adopting SwiftUI has brought new challenges related to performance. For example, there are many common code patterns in SwiftUI that can be inefficient, and many small papercuts can add up to a large cumulative performance hit. 
To begin addressing some of these issues at scale, we’ve created new tooling for proactively identifying these cases and statically validating correctness.</p><h1 id="21b1" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">SwiftUI feature architecture at Airbnb</h1><p id="6a39" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">We’ve been leveraging declarative UI patterns at Airbnb for many years, using our UIKit-based <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/introducing-epoxy-for-ios-6bf062be1670" data-discover="true">Epoxy library</a> and <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/introducing-epoxy-for-ios-6bf062be1670#fbe0" data-discover="true">unidirectional data flow</a> systems. When adopting SwiftUI in our screen layer, we decided to continue using our existing unidirectional data flow library. This simplified the process of incrementally adopting SwiftUI within our large codebase, and we find it improves the quality and maintainability of features.</p><p id="8863" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">However, we noticed that SwiftUI features using our unidirectional data flow library didn’t perform as well as we expected, and it wasn’t immediately obvious to us what the problem was. 
Understanding SwiftUI’s performance characteristics is an important requirement for building performant features, especially when using patterns outside of the “standard” SwiftUI toolbox.</p><h1 id="2fa3" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Understanding SwiftUI view diffing</h1><p id="5841" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">When working with declarative UI systems like SwiftUI, it’s important to ensure the framework knows which views need to be re-evaluated and re-rendered when the state of the screen changes. Changes are detected by diffing the view’s stored properties any time its parent is updated. Ideally, the view’s body will only be re-evaluated when its properties actually change:</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox ql"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*LvNBJSor0RDThlW3Oq7mWw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 
100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*LvNBJSor0RDThlW3Oq7mWw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*LvNBJSor0RDThlW3Oq7mWw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*LvNBJSor0RDThlW3Oq7mWw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*LvNBJSor0RDThlW3Oq7mWw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*LvNBJSor0RDThlW3Oq7mWw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*LvNBJSor0RDThlW3Oq7mWw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*LvNBJSor0RDThlW3Oq7mWw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="1b24" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">However, this behavior is not always the reality (more on why in a moment). Unnecessary view body evaluations hurt performance by performing unnecessary work.</p><p id="6e99" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">How do you know how often a view’s body is re-evaluated in a real app? An easy way to visualize this is with a modifier that applies a random color to the view every time it’s rendered. 
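</p><p class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">A minimal sketch of what such a modifier could look like is shown here. The names <em class="qx">debugRandom</em>, <em class="qx">debugRandomBackground</em>, and <em class="qx">CounterLabel</em> are hypothetical and chosen only for illustration; they are not SwiftUI APIs, and this is not necessarily the exact modifier used in our app:</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">import SwiftUI<br /><br />extension Color {<br />  /// Hypothetical helper: returns a new random opaque color on every access.<br />  static var debugRandom: Color {<br />    Color(<br />      red: .random(in: 0...1),<br />      green: .random(in: 0...1),<br />      blue: .random(in: 0...1))<br />  }<br />}<br /><br />extension View {<br />  /// Hypothetical debug modifier. The color is chosen when this method runs,<br />  /// so calling it inside a view's body picks a new color each time<br />  /// that body is evaluated.<br />  func debugRandomBackground() -&gt; some View {<br />    background(Color.debugRandom)<br />  }<br />}<br /><br />/// Example usage: the background changes color<br />/// whenever SwiftUI evaluates this view's body.<br />struct CounterLabel: View {<br />  let count: Int<br /><br />  var body: some View {<br />    Text("Count: \(count)")<br />      .debugRandomBackground()<br />  }<br />}</pre><p class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">A helper like this is meant for local debugging only, so it would typically be kept out of release builds. 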
When testing this on various views in our app’s most performance-sensitive screens, we quickly found that many views were re-evaluated and re-rendered more often than necessary:</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox qm"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 640w, https://miro.medium.com/v2/resize:fit:720/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 720w, https://miro.medium.com/v2/resize:fit:750/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 750w, https://miro.medium.com/v2/resize:fit:786/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 786w, https://miro.medium.com/v2/resize:fit:828/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 828w, https://miro.medium.com/v2/resize:fit:1100/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*yNrr8e8yI9RcKg6Z-VUhEA.gif 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h2 id="8688" class="qn pl io bf pm gk qo dy gl gm qp ea gn go qq gp gq gr qr gs gt gu qs gv gw qt bk">The SwiftUI view diffing algorithm</h2><p id="2287" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">SwiftUI’s built-in diffing algorithm is often overlooked and not officially documented, but it has a huge impact on performance. To determine if a view’s body needs to be re-evaluated, SwiftUI uses a reflection-based diffing algorithm to compare each of the view’s stored properties:</p><ol class=""><li id="3445" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov qu qv qw bk">If a type is <em class="qx">Equatable</em>, SwiftUI compares the old and new values using the type’s <em class="qx">Equatable</em> conformance. 
Otherwise:</li><li id="b3dd" class="ob oc io od b oe qy og oh oi qz ok ol go ra on oo gr rb oq or gu rc ot ou ov qu qv qw bk">SwiftUI compares value types (e.g., structs) by recursively comparing each instance property.</li><li id="d263" class="ob oc io od b oe qy og oh oi qz ok ol go ra on oo gr rb oq or gu rc ot ou ov qu qv qw bk">SwiftUI compares reference types (e.g., classes) using reference identity.</li><li id="6503" class="ob oc io od b oe qy og oh oi qz ok ol go ra on oo gr rb oq or gu rc ot ou ov qu qv qw bk">SwiftUI attempts to compare closures by identity. However, most non-trivial closures cannot be compared reliably.</li></ol><p id="3b35" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">If all of the view’s properties compare as equal to their previous values, then the body isn’t re-evaluated and the content isn’t re-rendered. Values using SwiftUI property wrappers like <em class="qx">@State</em> and <em class="qx">@Environment</em> don’t participate in this diffing algorithm, and instead trigger view updates through different mechanisms.</p><p id="7caf" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">When reviewing different views in our codebase, we found several common patterns that confounded SwiftUI’s diffing algorithm:</p><ol class=""><li id="ca6d" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov qu qv qw bk">Some types are inherently not supported, like closures.</li><li id="62fa" class="ob oc io od b oe qy og oh oi qz ok ol go ra on oo gr rb oq or gu rc ot ou ov qu qv qw bk">Simple data types stored on the view may be unexpectedly compared by reference instead of by value.</li></ol><p id="ad3d" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Here’s an example SwiftUI view with properties that interact poorly with the 
diffing algorithm:</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">struct MyView: View {<br />  /// A generated data model that is a struct with value semantics,<br />  /// but is copy-on-write and wraps an internal reference type.<br />  /// Compared by reference, not by value, which could cause unwanted body evaluations.<br />  let dataModel: CopyOnWriteDataModel<br /><br />  /// Other miscellaneous properties used by the view. Typically structs, but sometimes a class.<br />  /// Unexpected comparisons by reference could cause unwanted body evaluations.<br />  let requestState: MyFeatureRequestState<br /><br />  /// An action handler for this view, part of our unidirectional data flow library.<br />  /// Wraps a closure that routes the action to the screen's action handler.<br />  /// Closures almost always compare as not-equal, and typically cause unwanted body evaluations.<br />  let handler: Handler&lt;MyViewAction&gt;<br /><br />  var body: some View { ... }<br />}</pre><p id="b83f" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">If a view contains any value that isn’t diffable, the entire view becomes non-diffable. Preventing this in a scalable way is almost impossible with existing tools. This finding also reveals the performance issue caused by our unidirectional data flow library: action handling is closure-based, but SwiftUI can’t diff closures!</p><p id="655d" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">In some cases, like with the action handlers from our unidirectional data flow library, making the value diffable would require large, invasive, and potentially undesirable architecture changes. Even in simpler cases, this process is still time-consuming, and there’s no easy way to prevent a regression from creeping in later on. 
This is a big obstacle when trying to improve and maintain performance at scale in large codebases with many different contributors.</p><h1 id="d1a2" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Controlling SwiftUI view diffing</h1><p id="89a9" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">Fortunately, we have another option: If a view conforms to Equatable, SwiftUI will diff it using its Equatable conformance <em class="qx">instead</em> of using the default reflection-based diffing algorithm.</p><p id="8e46" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The advantage of this approach is that it lets us selectively decide which properties should be compared when diffing our view. In our case, we know that the handler object doesn’t affect the content or identity of our view. We only want our view to be re-evaluated and re-rendered when the <em class="qx">dataModel</em> and <em class="qx">requestState</em> values are updated. 
We can express that with a custom <em class="qx">Equatable</em> implementation:</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">// An Equatable conformance that makes the above SwiftUI view diffable.<br />extension MyView: Equatable {<br />  static func ==(lhs: MyView, rhs: MyView) -&gt; Bool {<br />    lhs.dataModel == rhs.dataModel<br />      &amp;&amp; lhs.requestState == rhs.requestState<br />      // Intentionally not comparing handler, which isn't Equatable.<br />  }<br />}</pre><p id="f9de" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">However:</p><ol class=""><li id="97d7" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov qu qv qw bk">This is a lot of additional boilerplate for engineers to write, especially for views with lots of properties.</li><li id="de77" class="ob oc io od b oe qy og oh oi qz ok ol go ra on oo gr rb oq or gu rc ot ou ov qu qv qw bk">Writing and maintaining a custom conformance is error-prone. 
You can easily forget to update the <em class="qx">Equatable</em> conformance when adding new properties later, which would cause bugs.</li></ol><p id="9a18" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">So, instead of manually writing and maintaining <em class="qx">Equatable</em> conformances, we created a new <em class="qx">@Equatable</em> macro that generates conformances for us.</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">// A sample SwiftUI view that has adopted @Equatable<br />// and is now guaranteed to be diffable.<br />@Equatable<br />struct MyView: View {<br />  // Simple data types must be Equatable, or the build will fail.<br />  let dataModel: CopyOnWriteDataModel<br />  let requestState: MyFeatureRequestState<br /><br />  // Types that aren't Equatable can be excluded from the<br />  // generated Equatable conformance using @SkipEquatable,<br />  // as long as they don’t affect the output of the view body.<br />  @SkipEquatable let handler: Handler&lt;MyViewAction&gt;<br /><br />  var body: some View { ... }<br />}</pre><p id="2551" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The <em class="qx">@Equatable</em> macro generates an <em class="qx">Equatable</em> implementation that compares all of the view’s stored instance properties, excluding properties with SwiftUI property wrappers like <em class="qx">@State</em> and <em class="qx">@Environment</em> that trigger view updates through other mechanisms. Properties that aren’t <em class="qx">Equatable</em> and don’t affect the output of the view body can be marked with <em class="qx">@SkipEquatable</em> to exclude them from the generated implementation. 
This allows us to continue using the closure-based action handlers from our unidirectional data flow library without impacting the SwiftUI diffing process!</p><p id="1055" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">After adopting the <em class="qx">@Equatable</em> macro on a view, that view is guaranteed to be diffable. If an engineer adds a non-<em class="qx">Equatable</em> property later, the build will fail, highlighting a potential regression in the diffing behavior. This effectively makes the <em class="qx">@Equatable</em> macro a sophisticated linter — which is really valuable for scaling these performance improvements in a codebase with many components and many contributors, since it makes it less likely for regressions to slip in later.</p><h1 id="65ab" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Managing the size of view bodies</h1><p id="8976" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">Another essential aspect of SwiftUI diffing is understanding that SwiftUI can only diff proper View structs. 
Any other code, such as computed properties or helper functions that generate a SwiftUI view, cannot be diffed.</p><p id="affb" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Consider the following example:</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">// Complex SwiftUI views are often simplified by<br />// splitting the view body into separate computed properties.<br />struct MyScreen: View {<br />  /// The unidirectional data flow state store for this feature.<br />  @ObservedObject var store: StateStore&lt;MyState, MyAction&gt;<br /><br />  var body: some View {<br />    VStack {<br />      headerSection<br />      actionCardSection<br />    }<br />  }<br /><br />  private var headerSection: some View {<br />    Text(store.state.titleString)<br />      .textStyle(.title)<br />  }<br /><br />  private var actionCardSection: some View {<br />    VStack {<br />      Image(store.state.cardSelected ? "enabled" : "disabled")<br />      Text("This is a selectable card")<br />    }<br />    .strokedCard(.roundedRect_mediumCornerRadius_12)<br />    .scaleEffectButton(action: {<br />      store.handle(.cardTapped)<br />    })<br />  }<br />}</pre><p id="150a" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">This is a common way to organize complex view bodies, since it makes the code easier to read and maintain. 
However, at runtime, SwiftUI effectively inlines the views returned from the properties into the main view body, as if we instead wrote:</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">// At runtime, computed properties are no different<br />// from just having a single, large view body!<br />struct MyScreen: View {<br />  @ObservedObject var store: StateStore&lt;MyState, MyAction&gt;<br /><br />  // Re-evaluated every time the state of the screen is updated.<br />  var body: some View {<br />    Text(store.state.titleString)<br />      .textStyle(.title)<br /><br />    VStack {<br />      Image(store.state.cardSelected ? "enabled" : "disabled")<br />      Text("This is a selectable card")<br />    }<br />    .strokedCard(.roundedRect_mediumCornerRadius_12)<br />    .scaleEffectButton(action: {<br />      store.handle(.cardTapped)<br />    })<br />  }<br />}</pre><p id="a040" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Since all of this code is part of the same view body, all of it will be re-evaluated when any part of the screen’s state changes. While this specific example is simple, as the view grows larger and more complicated, re-evaluating it will become more expensive. Eventually there would be a large amount of unnecessary work happening on every screen update, hurting performance.</p><p id="f96e" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">To improve performance, we can implement the layout code in separate SwiftUI views. 
This allows SwiftUI to properly diff each child view, only re-evaluating their bodies when necessary:</p><pre class="oz pa pb pc pd rd re rf bp rg bb bk">struct MyScreen: View {<br />  @ObservedObject var store: StateStore&lt;MyState, MyAction&gt;<br /><br />  var body: some View {<br />    VStack {<br />      HeaderSection(title: store.state.titleString)<br />      CardSection(<br />        isCardSelected: store.state.isCardSelected,<br />        handler: store.handler)<br />    }<br />  }<br />}<br /><br />/// Only re-evaluated and re-rendered when the title property changes.<br />@Equatable<br />struct HeaderSection: View {<br />  let title: String<br /><br />  var body: some View {<br />    Text(title)<br />      .textStyle(.title)<br />  }<br />}<br /><br />/// Only re-evaluated and re-rendered when the isCardSelected property changes.<br />@Equatable<br />struct CardSection: View {<br />  let isCardSelected: Bool<br />  @SkipEquatable let handler: Handler&lt;MyAction&gt;<br /><br />  var body: some View {<br />    VStack {<br />      // Read the view's own property; CardSection has no access to the store.<br />      Image(isCardSelected ? "enabled" : "disabled")<br />      Text("This is a selectable card")<br />    }<br />    .strokedCard(.roundedRect_mediumCornerRadius_12)<br />    .scaleEffectButton(action: {<br />      handler.handle(.cardTapped)<br />    })<br />  }<br />}</pre><p id="eb39" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">By breaking the view into smaller, diffable pieces, SwiftUI can efficiently update only the parts of the view that actually changed. This approach helps maintain performance as a feature grows more complex.</p><h2 id="3869" class="qn pl io bf pm gk qo dy gl gm qp ea gn go qq gp gq gr qr gs gt gu qs gv gw qt bk">View body complexity lint rule</h2><p id="3e58" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">Large, complex views aren’t always obvious during development. 
Easily available metrics like total line count aren’t a good proxy for complexity. To help engineers know when it’s time to refactor a view into smaller, diffable pieces, we created a custom <a class="ag hb" href="https://github.com/realm/SwiftLint" rel="noopener ugc nofollow" target="_blank">SwiftLint</a> rule that parses the view body using <a class="ag hb" href="https://github.com/swiftlang/swift-syntax" rel="noopener ugc nofollow" target="_blank">SwiftSyntax</a> and measures its complexity. We defined the view complexity metric as a value that increases every time you compose views using computed properties, functions, or closures. With this rule we automatically trigger an alert in Xcode when a view is getting too complex. (The complexity limit is configurable, and we currently allow a maximum complexity level of 10.)</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox rm"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Tdo1L8qZf81FWFeJaNY6yQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, 
(min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*Tdo1L8qZf81FWFeJaNY6yQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*Tdo1L8qZf81FWFeJaNY6yQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*Tdo1L8qZf81FWFeJaNY6yQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*Tdo1L8qZf81FWFeJaNY6yQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*Tdo1L8qZf81FWFeJaNY6yQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*Tdo1L8qZf81FWFeJaNY6yQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*Tdo1L8qZf81FWFeJaNY6yQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rn ff ro ow ox rp rq bf b bg ab du">The rule shows as a warning during local Xcode builds alerting engineers as early as possible. 
In this screenshot, the complexity limit is set to 3, and this specific view has a complexity of 5.</figcaption></figure><h1 id="c15f" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Conclusion</h1><p id="ed7c" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">With an understanding of how SwiftUI view diffing works, we can use an <em class="qx">@Equatable</em> macro to ensure view bodies are only re-evaluated when the values inside views actually change, break views into smaller parts for faster re-evaluation, and encourage developers to refactor views before they get too large and complex.</p><p id="3882" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Applying these three techniques to SwiftUI views in our app has led to a large reduction in unnecessary view re-evaluation and re-renders. Revisiting the examples from earlier, you see far fewer re-renders in the search bar and filter panel:</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox rr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 640w, https://miro.medium.com/v2/resize:fit:720/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 720w, https://miro.medium.com/v2/resize:fit:750/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 750w, https://miro.medium.com/v2/resize:fit:786/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 786w, https://miro.medium.com/v2/resize:fit:828/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 828w, https://miro.medium.com/v2/resize:fit:1100/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 1100w, https://miro.medium.com/v2/resize:fit:1400/1*tWhEXK5kyFP5KYPCUSvp6Q.gif 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="083c" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Using results from our <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-ios-36d5f200bc73" data-discover="true">page performance score</a> system, we’ve found that adopting these techniques in our most complicated SwiftUI screens really does improve performance for our users. 
For example, we reduced <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-ios-36d5f200bc73#4c63" data-discover="true">scroll hitches</a> by 15% on our main Search screen by adopting <em class="qx">@Equatable</em> on its most important views and breaking apart large view bodies into smaller diffable pieces. These techniques also give us the flexibility to use a feature architecture that best suits our needs without compromising performance or imposing burdensome limitations (e.g., completely avoiding closures in SwiftUI views).</p><p id="91b5" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Of course, these techniques aren’t a silver bullet. It’s not necessary for all SwiftUI features to use them, and these techniques by themselves aren’t enough to guarantee great performance. However, understanding how and why they work serves as a valuable foundation for building performant SwiftUI features, and makes it easier to spot and avoid problematic patterns in your own code.</p><p id="4eb4" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">If you’re interested in joining us on our quest to make the best iOS app in the App Store, please see our <a class="ag hb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">careers</a> page for open iOS roles.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/understanding-and-improving-swiftui-performance-37b77ac61896</link>
      <guid>https://medium.com/airbnb-engineering/understanding-and-improving-swiftui-performance-37b77ac61896</guid>
      <pubDate>Tue, 24 Jun 2025 18:43:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Load Testing with Impulse at Airbnb]]></title>
      <description><![CDATA[<div><div></div><p id="867b" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Comprehensive Load Testing with Load Generator, Dependency Mocker, Traffic Collector, and More</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox oy"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*3LijjQrJDLVA_ptfeRe83g.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*3LijjQrJDLVA_ptfeRe83g.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*3LijjQrJDLVA_ptfeRe83g.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*3LijjQrJDLVA_ptfeRe83g.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*3LijjQrJDLVA_ptfeRe83g.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*3LijjQrJDLVA_ptfeRe83g.jpeg 828w, 
https://miro.medium.com/v2/resize:fit:1100/1*3LijjQrJDLVA_ptfeRe83g.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*3LijjQrJDLVA_ptfeRe83g.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="d5e7" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Authors: <a class="ag hb" href="https://www.linkedin.com/in/chenhao-yang-9799b022/" rel="noopener ugc nofollow" target="_blank">Chenhao Yang</a>, <a class="ag hb" href="https://www.linkedin.com/in/haoyue-wang-a722509a/" rel="noopener ugc nofollow" target="_blank">Haoyue Wang</a>, <a class="ag hb" href="https://www.linkedin.com/in/xiaoyawei/" rel="noopener ugc nofollow" target="_blank">Xiaoya Wei</a>, <a class="ag hb" href="https://www.linkedin.com/in/zhijie-guan/" rel="noopener ugc nofollow" target="_blank">Zay Guan</a>, <a class="ag hb" href="https://www.linkedin.com/in/yaolin-chen-591a31339/" rel="noopener ugc nofollow" target="_blank">Yaolin Chen</a> and <a class="ag hb" href="https://www.linkedin.com/in/fei-yuan/" rel="noopener ugc nofollow" target="_blank">Fei Yuan</a></p><p id="bc87" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">System-level load testing is crucial for reliability and efficiency. It identifies bottlenecks, evaluates capacity for peak traffic, establishes performance baselines, and detects errors. 
At a company of Airbnb’s size and complexity, we’ve learned that load testing needs to be robust, flexible, and decentralized. This requires the right set of tools to enable engineering teams to do self-service load tests that integrate seamlessly with CI.</p><p id="d857" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Impulse is one of our internal load-testing-as-a-service frameworks. It provides tools that can generate synthetic loads, mock dependencies, and collect traffic data from production environments. In this blog post, we’ll share how Impulse is architected to minimize manual effort, seamlessly integrate with our observability stack, and empower teams to proactively address potential issues.</p><h1 id="2c65" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Architecture</h1><p id="806d" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">Impulse is a comprehensive load testing framework that allows service owners to conduct context-aware load tests, mock dependencies, and collect traffic data to ensure the system’s performance under various conditions. It includes the following components:</p><ol class=""><li id="95b3" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov ql qm qn bk"><strong class="od ip">Load generator</strong> to generate context-aware requests on the fly, for testing different scenarios with synthetic or collected traffic.</li><li id="11bc" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov ql qm qn bk"><strong class="od ip">Dependency mocker</strong> to mock the downstream responses with latency, so that the load testing on the service under test (SUT) doesn’t need to involve certain dependent services. 
This is especially crucial when the dependencies are vendor services that don’t support load testing, or if the team wants to regression load test their service during day-to-day deployment without affecting downstreams.</li><li id="a388" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov ql qm qn bk"><strong class="od ip">Traffic collector</strong> to collect both the upstream and downstream traffic from the production environment, and then apply the resulting data to the test environment.</li><li id="3944" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov ql qm qn bk"><strong class="od ip">Testing API generator</strong> to wrap asynchronous workflows into synchronous API calls for load testing.</li></ol><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox qt"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*SFDblGijyiLQfI7C5RpPXw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*SFDblGijyiLQfI7C5RpPXw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*SFDblGijyiLQfI7C5RpPXw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*SFDblGijyiLQfI7C5RpPXw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*SFDblGijyiLQfI7C5RpPXw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*SFDblGijyiLQfI7C5RpPXw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*SFDblGijyiLQfI7C5RpPXw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*SFDblGijyiLQfI7C5RpPXw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qu ff qv ow ox qw qx bf b bg ab du">Figure 1: The Impulse framework and its four main components</figcaption></figure><p id="f043" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Each of these four tools is independent, giving service owners the flexibility to select one or more components for their load testing needs.</p><h2 id="3397" class="qy pl io bf pm gk qz dy gl gm ra ea gn go rb gp gq gr rc gs gt gu rd gv gw re bk">Load generator</h2><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox rf"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 640w, 
https://miro.medium.com/v2/resize:fit:720/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*WD4EWyWHDQMf_7nDIGkAuA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*WD4EWyWHDQMf_7nDIGkAuA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*WD4EWyWHDQMf_7nDIGkAuA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*WD4EWyWHDQMf_7nDIGkAuA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*WD4EWyWHDQMf_7nDIGkAuA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*WD4EWyWHDQMf_7nDIGkAuA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*WD4EWyWHDQMf_7nDIGkAuA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*WD4EWyWHDQMf_7nDIGkAuA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, 
(min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qu ff qv ow ox qw qx bf b bg ab du">Figure 2: Containerized load generator</figcaption></figure><p id="b182" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk"><em class="rg">Context aware</em></p><p id="b9cc" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">When load testing, requests made to the SUT often require information from a previous response or need to be sent in a specific order. For example, if an update API requires the <em class="rg">entity_id</em> of the entity to update, we must ensure that entity already exists in the testing environment context.</p><p id="0224" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Our load generator tool allows users to write arbitrary testing logic in Java or Kotlin and launch containers to run these tests at scale against the SUT. 
Why write code instead of DSL/configuration logic?</p><ul class=""><li id="9218" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov rh qm qn bk">Flexibility: Programming languages are more expressive than a DSL and can better support complex contextual scenarios.</li><li id="28ad" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">Reusability: The same testing code can be used in other tests, e.g., integration tests.</li><li id="d916" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">Developer proficiency: There is little or no learning curve to onboard, since developers already know the language and don’t need to learn a new way of writing testing logic.</li><li id="91f5" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">Developer experience: IDE support, testing, debugging, etc.</li></ul><p id="700e" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Here is an example of a synthetic context-aware test case:</p><pre class="oz pa pb pc pd ri rj rk bp rl bb bk">class HelloWorldLoadGenerator : LoadGenerator {<br />   override suspend fun run() {<br />       val createdEntity = sutApiClient.create(CreateRequest(name="foo", ...)).data<br />       // request with id from previous response (context)<br />       val updateResponse = sutApiClient.update(UpdateRequest(id=createdEntity.id, name="bar"))<br />       // ... other operations<br />       // clean up<br />       sutApiClient.delete(DeleteRequest(id=createdEntity.id))<br />   }<br />}</pre>
<p id="eabe" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk"><em class="rg">Decentralized</em></p><p id="f86c" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The load generator is decentralized and containerized, which means that each time a load test is triggered, a new set of containers is created to run the test. This design has several benefits:</p><ul class=""><li id="c1b5" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov rh qm qn bk">Isolation: Load testing runs for different services are isolated from each other, eliminating any interference.</li><li id="a85d" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">Scalability: The number of containers can be scaled up or down according to the traffic requirements.</li><li id="6e5e" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">Cost efficiency: The containers are short-lived, as they only exist during the load testing run.</li></ul><p id="86d0" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">One subtle point: because our services are cloud-based, the Impulse framework distributes the workers evenly across all our data centers, and the load is emitted evenly from all the workers. Impulse’s load generator ensures the overall trigger per second (TPS) rate is as configured. 
Based on this, we can better leverage the locality settings in load balancers, which can better mimic the real traffic distribution in production.</p><p id="2d65" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk"><em class="rg">Execution</em></p><p id="f88d" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The load generator is designed to be executed in the CI/CD pipeline, which means we can trigger load testing automatically. Developers can configure the testing spec in multiple phases, e.g., a warm up phase, a steady state phase, a peak phase, etc. Each phase can be configured with:</p><ul class=""><li id="4f2e" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov rh qm qn bk">Test cases to run</li><li id="a93b" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">TPS (trigger per second) of each test case</li><li id="5318" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">Test duration</li></ul><h2 id="bf73" class="qy pl io bf pm gk qz dy gl gm ra ea gn go rb gp gq gr rc gs gt gu rd gv gw re bk">Dependency mocker</h2><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox rr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/format:webp/1*QXMa3Nj3-aSUE_EOvlv1xw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*QXMa3Nj3-aSUE_EOvlv1xw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*QXMa3Nj3-aSUE_EOvlv1xw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*QXMa3Nj3-aSUE_EOvlv1xw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*QXMa3Nj3-aSUE_EOvlv1xw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*QXMa3Nj3-aSUE_EOvlv1xw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*QXMa3Nj3-aSUE_EOvlv1xw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*QXMa3Nj3-aSUE_EOvlv1xw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qu ff qv ow ox qw qx bf b bg ab du">Figure 3: Dependency mocker</figcaption></figure><p id="9b6a" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Impulse is a decentralized framework where each service has its own 
dependency mocker. This can eliminate interference between services and reduce communication costs. Each dependency mocker is an out-of-process service, which means the SUT behaves just as it does in production. We run the mockers in separate instances to avoid any impact on the performance of the SUT. The mock servers are all short lived — they only start before tests run and shut down afterwards to save costs and maintenance effort. The response latency and exceptions are configurable and the number of mocker instances can be adjusted on demand to support large amounts of traffic.</p><p id="bcce" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Other noteworthy features:</p><ul class=""><li id="f1c1" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov rh qm qn bk">You can selectively stub some of the dependencies. Currently, stubbing is supported for HTTP JSON, Airbnb Thrift, and Airbnb GraphQL dependencies.</li><li id="42e8" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov rh qm qn bk">The dependency mockers support use cases beyond load testing. For instance, integration tests often rely on other services or third-party API calls, which may not guarantee a stable testing environment or might only support ideal scenarios. 
Dependency mockers can address this by offering predefined responses or exceptions to fully test those flows.</li></ul><p id="5c25" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Impulse supports two options for generating mock responses:</p><ol class=""><li id="29f3" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov ql qm qn bk">Synthetic response: The response is generated by user logic, as in integration testing; the difference is that the response comes from a remote (out-of-process) server with simulated latency.<br />- Similar to the load generator, the logic is written in Java/Kotlin code and contains request matching and response generation.<br />- Latency can be simulated using p95/p99 metrics.</li><li id="8fc6" class="ob oc io od b oe qo og oh oi qp ok ol go qq on oo gr qr oq or gu qs ot ou ov ql qm qn bk">Replay response: The response is replayed from the production downstream recording, supported by the traffic collector component.</li></ol><p id="adb4" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Here is an example of a synthetic response with latency in Kotlin:</p><pre class="oz pa pb pc pd ri rj rk bp rl bb bk">downstreamsMocking.every(<br />      thriftRequest&lt;FooRequest&gt;().having { it.message == "hello" }<br />    ).returns { request -&gt;<br />      ThriftDownstream.Response.thriftEncoded(<br />        HttpStatus.OK,<br />        FooResponse.builder.reply("${request.message} world").build()<br />      )<br />    }.with {<br />      delay = latencyFromP95(p95=500.milliseconds, min=200.milliseconds, max=2000.milliseconds)<br />    }</pre><h2 id="d412" class="qy pl io bf pm gk qz dy gl gm ra ea gn go rb gp gq gr rc gs gt gu rd gv gw re bk">Traffic collector</h2><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div 
class="ow ox rs"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*EImg3JUEGzbos5r3_6U-FQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*EImg3JUEGzbos5r3_6U-FQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*EImg3JUEGzbos5r3_6U-FQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*EImg3JUEGzbos5r3_6U-FQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*EImg3JUEGzbos5r3_6U-FQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*EImg3JUEGzbos5r3_6U-FQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*EImg3JUEGzbos5r3_6U-FQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*EImg3JUEGzbos5r3_6U-FQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, 
(min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qu ff qv ow ox qw qx bf b bg ab du">Figure 4: Traffic collector</figcaption></figure><p id="5c47" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The traffic collector component is designed to capture both upstream and downstream traffic, along with the relationships between them. This approach allows Impulse to accurately replay production traffic during load testing, avoiding inconsistencies in downstream data or behavior. By replicating downstream responses — including production-like latency and errors — via the dependency mocker, the system ensures high-fidelity load testing. As a result, services in the testing environment behave identically to those in production, enabling more realistic and reliable performance evaluations.</p><h2 id="9575" class="qy pl io bf pm gk qz dy gl gm ra ea gn go rb gp gq gr rc gs gt gu rd gv gw re bk">Testing API generator</h2><p id="bb25" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">We rely heavily on event-driven, asynchronous workflows that are critical to our business operations. These include processing events from a message queue (MQ) and executing delayed jobs. Most of the MQ events/jobs are emitted from synchronous flows (e.g., API calls), so theoretically they can be covered by API load testing. However, the real world is more complex. 
These asynchronous flows often involve long chains of event and job emissions originating from various sources, making it difficult to replicate and test them accurately using only API-based methods.</p><p id="4e30" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">To address this, the testing API generator component creates HTTP APIs during the CI stage according to the event or job schema. These APIs act as wrappers around the underlying asynchronous flows and are registered exclusively in the testing environment. This setup enables load testing tools — such as load generators — to send traffic to these synthetic APIs, allowing asynchronous flows to be exercised as if they were synchronous. As a result, it’s possible to perform targeted, realistic load testing on asynchronous logic that would otherwise be hard to simulate.</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div class="ow ox rt"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 1100w, https://miro.medium.com/v2/resize:fit:1320/format:webp/1*F3qllm7qqMu4N2k0bBFbdQ.png 1320w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, 
(min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 660px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*F3qllm7qqMu4N2k0bBFbdQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*F3qllm7qqMu4N2k0bBFbdQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*F3qllm7qqMu4N2k0bBFbdQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*F3qllm7qqMu4N2k0bBFbdQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*F3qllm7qqMu4N2k0bBFbdQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*F3qllm7qqMu4N2k0bBFbdQ.png 1100w, https://miro.medium.com/v2/resize:fit:1320/1*F3qllm7qqMu4N2k0bBFbdQ.png 1320w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 660px" /></picture></div><figcaption class="qu ff qv ow ox qw qx bf b bg ab du">Figure 5: Testing API generator for async flows</figcaption></figure><p id="e448" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The goal of the testing API generator is to help developers identify performance bottlenecks and potential issues in their async flow implementations and under high traffic conditions. It does this by enabling direct load testing of async flows without involving middleware components like MQs. The rationale is that developers typically aim to evaluate the behavior of their own logic, not the middleware, which is usually already well-tested. 
By bypassing these components, this approach simplifies the load testing process and empowers developers to independently manage and execute their own tests.</p><h2 id="6ee9" class="qy pl io bf pm gk qz dy gl gm ra ea gn go rb gp gq gr rc gs gt gu rd gv gw re bk">Integration with other testing frameworks</h2><p id="29bc" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">Airbnb emphasizes product quality, utilizing versatile testing frameworks that cover integration and API tests across development, staging, and production environments, and integrate smoothly into CI/CD pipelines. The modular design of Impulse facilitates its integration with these frameworks, offering systematic service testing.</p><figure class="oz pa pb pc pd pe ow ox paragraph-image"><div role="button" tabindex="0" class="pf pg fl ph bh pi"><div class="ow ox ru"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*649CFxbpASxHotVVbqdQkQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*649CFxbpASxHotVVbqdQkQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*649CFxbpASxHotVVbqdQkQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*649CFxbpASxHotVVbqdQkQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*649CFxbpASxHotVVbqdQkQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*649CFxbpASxHotVVbqdQkQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*649CFxbpASxHotVVbqdQkQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*649CFxbpASxHotVVbqdQkQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qu ff qv ow ox qw qx bf b bg ab du">Figure 6: How Impulse interfaces with other internal testing frameworks</figcaption></figure><h1 id="649a" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Conclusion</h1><p id="9a29" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">In this blog post, we shared how Impulse and its four core components help developers perform self-service load testing at Airbnb. 
As of this writing, Impulse has been implemented in several customer support backend services and is under review by other teams across the company that plan to use it for load testing.</p><p id="9b09" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">We’ve received a lot of good feedback in the process. For example: “<em class="rg">Impulse helps us to identify and address potential issues in our service. During testing, it detected an ApiClientThreadToolExhaustionException caused by thread pool pressure. Additionally, it alerted us about occasional timeout errors in client API calls during service deployments. Impulse helped us identify high memory usage in the main service container, enabling us to fine-tune the memory allocation and optimize our service’s resource usage. Highly recommend utilizing Impulse as an integral part of the development and testing processes.</em>”</p><h1 id="7f23" class="pk pl io bf pm pn po pp gl pq pr ps gn pt pu pv pw px py pz qa qb qc qd qe qf bk">Acknowledgments</h1><p id="e6c7" class="pw-post-body-paragraph ob oc io od b oe qg og oh oi qh ok ol go qi on oo gr qj oq or gu qk ot ou ov hq bk">Thanks to Jeremy Werner, Yashar Mehdad, Raj Rajagopal, Claire Cheng, Tim L., Wei Ji, Jay Wu, and Brian Wallace for their support on the Impulse project.</p><p id="41d1" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Does this type of work interest you? Check out our open roles <a class="ag hb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/load-testing-with-impulse-at-airbnb-f466874d03d2</link>
      <guid>https://medium.com/airbnb-engineering/load-testing-with-impulse-at-airbnb-f466874d03d2</guid>
      <pubDate>Mon, 09 Jun 2025 19:45:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Listening, Learning, and Helping at Scale: How Machine Learning Transforms Airbnb’s Voice Support…]]></title>
      <description><![CDATA[<div><div></div><p id="f2ec" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">A look into how Airbnb uses speech recognition, intent detection, and language models to understand users and assist agents more effectively.</p><p id="0fc0" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk"><em class="ow">By </em><a class="ag hb" href="https://www.linkedin.com/in/yuanpei-cao-792b103b/" rel="noopener ugc nofollow" target="_blank"><em class="ow">Yuanpei Cao</em></a><em class="ow">, </em><a class="ag hb" href="https://www.linkedin.com/in/heng-j-1a44a711/" rel="noopener ugc nofollow" target="_blank"><em class="ow">H</em>eng Ji</a><em class="ow">, </em><a class="ag hb" href="https://www.linkedin.com/in/elaineliu5/" rel="noopener ugc nofollow" target="_blank"><em class="ow">Elaine Liu</em></a><em class="ow">, </em><a class="ag hb" href="https://www.linkedin.com/in/peng-wang-13117371/" rel="noopener ugc nofollow" target="_blank"><em class="ow">Peng Wang</em></a><em class="ow">, and </em><a class="ag hb" href="https://www.linkedin.com/in/tiantian-zhang-a4208726/" rel="noopener ugc nofollow" target="_blank"><em class="ow">Tiantian Zhang</em></a></p><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fl pi bh pj"><div class="ox oy oz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 828w, 
https://miro.medium.com/v2/resize:fit:1100/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*zyT9hDwGkSCvZ-wKEx669w.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="0383" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">At Airbnb, we aim to provide a smooth, intuitive, and helpful 
community support experience, whether it’s helping a guest navigate a booking change or helping a host with a listing issue. While our Help Center and customer support chatbot help resolve many inquiries efficiently, some users prefer the immediacy of a voice conversation with a support representative. To make these interactions faster and more effective, we’ve significantly improved our Interactive Voice Response (IVR) system via machine learning.</p><p id="d0a7" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Over the years, Airbnb has invested in conversational AI to enhance customer support. In our previous blog posts <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/task-oriented-conversational-ai-in-airbnb-customer-support-5ebf49169eaa" data-discover="true"><em class="ow">Task-Oriented Conversational AI in Airbnb Customer Support</em></a> and <a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/using-chatbots-to-provide-faster-covid-19-community-support-567c97c5c1c9" data-discover="true"><em class="ow">Using Chatbots to Provide Faster COVID-19 Community Support</em></a>, we explored how AI-driven chatbots streamline guest and host interactions through automated messaging. 
This post explains how we extend that work to voice-based support, leveraging machine learning to improve real-time phone interactions with our intelligent IVR system.</p><p id="f521" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">We’ll take you through the end-to-end IVR journey, the key machine learning components that power it, and how we designed a system that delivers faster, more human-like, and more intuitive voice support for our community.</p><h1 id="25ca" class="pl pm io bf pn po pp pq gl pr ps pt gn pu pv pw px py pz qa qb qc qd qe qf qg bk">Reimagining the voice support journey</h1><p id="9f76" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">Traditional IVR systems often rely on rigid menu trees, requiring callers to press buttons and navigate pre-set paths. Instead, we designed an adaptive, conversational IVR that listens, understands, and responds in real time. Here’s what typically happens when a caller reaches out to Airbnb support:</p><ol class=""><li id="36e1" class="ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov qm qn qo bk"><strong class="od ip">Call and greeting: </strong>The IVR picks up the call and prompts, <em class="ow">“In a few sentences, please tell us why you’re calling today.”</em></li><li id="037f" class="ob oc io od b oe qp og oh oi qq ok ol go qr on oo gr qs oq or gu qt ot ou ov qm qn qo bk"><strong class="od ip">Automated speech recognition (ASR):</strong> The caller’s response is transcribed with Airbnb-specific ASR. 
For example, if a caller says, <em class="ow">“I need to request a refund for my reservation,”</em> ASR accurately converts this speech into text, preserving key domain-specific terms.</li><li id="5e26" class="ob oc io od b oe qp og oh oi qq ok ol go qr on oo gr qs oq or gu qt ot ou ov qm qn qo bk"><strong class="od ip">Understanding intent:</strong> A Contact Reason Detection model classifies the issue into a category like cancellations, refunds, account issues, etc.</li><li id="fc74" class="ob oc io od b oe qp og oh oi qq ok ol go qr on oo gr qs oq or gu qt ot ou ov qm qn qo bk"><strong class="od ip">Decision-making:</strong> If self-service is possible, the system retrieves and sends a relevant help article or an intelligent workflow via SMS or app notification. If the caller explicitly requests agent support or the issue requires human intervention, the call is routed to a customer support agent with relevant details attached.</li><li id="b538" class="ob oc io od b oe qp og oh oi qq ok ol go qr on oo gr qs oq or gu qt ot ou ov qm qn qo bk"><strong class="od ip">Clarifying response:</strong> A Paraphrasing model generates a summary of the user intent, which IVR shares with the user before delivering the solution. This ensures that users understand the context of the resource they receive. Continuing our example, the system would respond, “<em class="ow">I understand your issue is regarding a refund request.</em> <em class="ow">We have sent you a link to resources about this topic. Follow the instructions to find answers. 
If you need to speak with an agent, press 0 to be connected to our customer service representative.</em>” The underscored Paraphrasing component enhances engagement by bridging the gap between system-generated responses and user comprehension, making the self-service experience more intuitive.</li><li id="1c0e" class="ob oc io od b oe qp og oh oi qq ok ol go qr on oo gr qs oq or gu qt ot ou ov qm qn qo bk"><strong class="od ip">Resolution or escalation:</strong> The caller receives an SMS or app notification with a direct link to a relevant <a class="ag hb" href="https://www.airbnb.com/help" rel="noopener ugc nofollow" target="_blank">Airbnb Help Center</a> article. If further assistance is needed, they can press 0 to connect with a customer service representative.</li></ol><p id="cfcd" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">By moving away from rigid menus to natural language understanding, we allow guests and hosts to express their issues in their own words, helping to increase satisfaction and resolution efficiency.</p><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fl pi bh pj"><div class="ox oy qu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*4MMPeZJDlFDELNNSZ8dGPA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and 
(max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*4MMPeZJDlFDELNNSZ8dGPA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*4MMPeZJDlFDELNNSZ8dGPA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*4MMPeZJDlFDELNNSZ8dGPA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*4MMPeZJDlFDELNNSZ8dGPA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*4MMPeZJDlFDELNNSZ8dGPA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*4MMPeZJDlFDELNNSZ8dGPA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*4MMPeZJDlFDELNNSZ8dGPA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qv ff qw ox oy qx qy bf b bg ab du">Figure 1: High-level architecture of how Airbnb IVR Core Service interacts with core machine learning components to resolve user issues over the phone.</figcaption></figure><h1 id="d9a8" class="pl pm io bf pn po pp pq gl pr ps pt gn pu pv pw px py pz qa qb qc qd qe qf qg bk">Breaking down our ML-powered IVR system</h1><h2 id="bbf2" class="qz pm io bf pn gk ra dy gl gm rb ea gn go rc gp gq gr rd gs gt gu re gv gw rf bk">1. 
Automated speech recognition (ASR): transcribing with precision</h2><p id="8b18" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">In a voice-driven support system, achieving high transcription accuracy is essential, particularly in noisy phone environments where speech can be unclear. General speech recognition models often struggle with Airbnb-specific terminology, leading to errors like misinterpreting “listing” as “lifting” or “help with my stay” as “happy Christmas Day.” These inaccuracies create challenges in understanding user intent and impact downstream processes.</p><p id="59d4" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">To enhance ASR accuracy, we transitioned from a generic high-quality pretrained model to one specifically adapted for noisy phone audio. Additionally, we introduced a domain-specific phrase list optimization that ensures Airbnb terms are properly recognized. On a sample of hundreds of clips, these changes <strong class="od ip">reduced the word error rate (WER) from 33% to approximately 10%</strong>. The lower WER made downstream help article recommendations markedly more accurate, increasing user engagement and improving customer NPS among users who interacted with the ASR menu, while reducing reliance on human agents and lowering customer service handling time.</p><h2 id="9c8f" class="qz pm io bf pn gk ra dy gl gm rb ea gn go rc gp gq gr rd gs gt gu re gv gw rf bk">2. Contact Reason prediction: understanding the why</h2><p id="b3c9" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">After transcribing the caller’s statements, the next step involves identifying their intent. 
We accomplished this by creating a detailed Contact Reason taxonomy that categorizes all potential Airbnb inquiries, as elaborated in “<a class="ag hb" rel="noopener" href="https://medium.com/airbnb-engineering/t-leaf-taxonomy-learning-and-evaluation-framework-30ae19ce8c52" data-discover="true">T-LEAF: Taxonomy Learning and EvaluAtion Framework</a>.” We then use an intent detection model to classify calls into a Contact Reason category, ensuring each inquiry is handled appropriately. For example, if a caller mentions “I haven’t received my refund yet,” the model predicts the Contact Reason as Missing Refund and forwards it to the relevant downstream components.</p><p id="d372" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">In production, we deploy the Issue Detection Service to host the intent detection models, running them in parallel to achieve optimal scalability, flexibility, and efficiency. Parallel computing ensures that intent detection <strong class="od ip">latency remains under 50ms on average</strong>, making the process imperceptible to IVR users and ensuring a seamless real-time experience. The detected intent is then analyzed within the IVR workflow to determine the next action, whether it’s guiding the user through a self-service resolution or escalating directly to a human agent.</p><p id="771b" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Occasionally, callers prefer to speak directly with a human agent instead of describing their issues, using terms like “agent” or “escalation.” For such scenarios, we use a different intent detection model to recognize when a caller wants to escalate to a human agent. 
If this intent is detected, the IVR system honors the caller’s request and routes the call to the suitable support team.</p><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fl pi bh pj"><div class="ox oy rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*oj7NCOlBGYOnNBOx 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*oj7NCOlBGYOnNBOx 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*oj7NCOlBGYOnNBOx 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*oj7NCOlBGYOnNBOx 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*oj7NCOlBGYOnNBOx 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*oj7NCOlBGYOnNBOx 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*oj7NCOlBGYOnNBOx 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*oj7NCOlBGYOnNBOx 640w, https://miro.medium.com/v2/resize:fit:720/0*oj7NCOlBGYOnNBOx 720w, https://miro.medium.com/v2/resize:fit:750/0*oj7NCOlBGYOnNBOx 750w, https://miro.medium.com/v2/resize:fit:786/0*oj7NCOlBGYOnNBOx 786w, https://miro.medium.com/v2/resize:fit:828/0*oj7NCOlBGYOnNBOx 828w, https://miro.medium.com/v2/resize:fit:1100/0*oj7NCOlBGYOnNBOx 1100w, https://miro.medium.com/v2/resize:fit:1400/0*oj7NCOlBGYOnNBOx 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qv ff qw ox oy qx qy bf b bg ab du">Figure 2. Intent detection architecture and Issue Detection Service.</figcaption></figure><h2 id="07e0" class="qz pm io bf pn gk ra dy gl gm rb ea gn go rc gp gq gr rd gs gt gu re gv gw rf bk">3. Help article retrieval: delivering the right information</h2><p id="643a" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">Many common Airbnb issues can be quickly resolved by providing clear and relevant educational information. To help provide useful information to users and minimize the need for human customer support, we use the Help Article Retrieval and Ranking system. This advanced system automatically identifies the issue in a user’s inquiry and delivers the most relevant help article link via SMS text message and Airbnb app notification. Our process incorporates two machine learning stages.</p><p id="0759" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk"><strong class="od ip">Semantic retrieval and ranking:</strong> We index Airbnb Help Article embeddings into a vector database, enabling efficient retrieval of up to 30 relevant articles per user query using cosine similarity, typically within 60ms. An LLM-based ranking model then re-ranks these retrieved articles, with the top-ranked article directly presented to users via IVR channels. This dual-stage system not only powers IVR interactions but also supports our customer support chatbot and Help Center search. 
Across these platforms, its effectiveness is continuously evaluated using metrics like Precision@N, facilitating ongoing improvements and refinements.</p><figure class="pa pb pc pd pe pf ox oy paragraph-image"><div role="button" tabindex="0" class="pg ph fl pi bh pj"><div class="ox oy rh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*wlkl-CT0czyGqNdKx-ffIw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*wlkl-CT0czyGqNdKx-ffIw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*wlkl-CT0czyGqNdKx-ffIw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*wlkl-CT0czyGqNdKx-ffIw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*wlkl-CT0czyGqNdKx-ffIw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*wlkl-CT0czyGqNdKx-ffIw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*wlkl-CT0czyGqNdKx-ffIw.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*wlkl-CT0czyGqNdKx-ffIw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qv ff qw ox oy qx qy bf b bg ab du">Figure 3. Architecture diagram for the Help Article Retrieval and Ranking system.</figcaption></figure><h2 id="a072" class="qz pm io bf pn gk ra dy gl gm rb ea gn go rc gp gq gr rd gs gt gu re gv gw rf bk">4. Paraphrasing model: enhancing user understanding</h2><p id="c8ac" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">A key challenge in IVR-based customer support is ensuring users clearly understand the resolution before receiving help article links, as they typically lack visibility into the article’s contents or title. To address this, we implemented a lightweight paraphrasing approach leveraging a curated set of standardized summaries.</p><p id="f500" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">UX writers created concise and clear paraphrases for common Airbnb scenarios. During online serving, user inquiries are mapped to these curated summaries via nearest-neighbor matching based on text embedding similarity. We calibrated a similarity threshold to ensure high-quality matches. 
Manual evaluation of end-to-end model outputs confirmed precision exceeding 90%.</p><p id="b603" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">The outcome was a finite-state solution delivering the most appropriate paraphrased IVR prompt before presenting a help article link. For example, if a caller states, “I need to cancel my reservation and request a refund,” the model returns a response like “I understand your issue is about a refund request” before sending the retrieved help article link.</p><p id="e3d2" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Integrating this model ensures users receive clear, contextually relevant summaries prior to accessing help articles. In an experiment targeting English hosts who contacted customer support, we found that presenting a paraphrased summary before sending the article link increased user engagement with article content, which improved self-resolution rates and helped reduce the need for direct customer support assistance.</p><h1 id="3258" class="pl pm io bf pn po pp pq gl pr ps pt gn pu pv pw px py pz qa qb qc qd qe qf qg bk">Conclusion</h1><p id="ec28" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">By combining Automated Speech Recognition and Contact Reason Detection systems with a help article retrieval system and a paraphrasing model, we have created an IVR system that streamlines support interactions and improves user satisfaction. Our solution enables callers to describe issues naturally, reduces dependency on human agents for common inquiries, and provides instant, relevant support through self-service. 
When human assistance is necessary, the system ensures a smooth transition by routing users to the right agent with essential context.</p><p id="34b4" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Interested in working at Airbnb? Check out our <a class="ag hb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open roles</a>.</p><h1 id="60aa" class="pl pm io bf pn po pp pq gl pr ps pt gn pu pv pw px py pz qa qb qc qd qe qf qg bk"><strong class="am">Acknowledgements</strong></h1><p id="bb61" class="pw-post-body-paragraph ob oc io od b oe qh og oh oi qi ok ol go qj on oo gr qk oq or gu ql ot ou ov hq bk">Thanks to Zhenyu Zhao, Mia Zhao, Wayne Zhang, Lucca Siaudzionis, Lulu Chen, Sukey Xu, Floria Wan, Michael Zhou, Can Yang, Yaolin Chen, Shuaihu Wang, Huifan Qu, Ming Shang, Yu Jiang, Wanting Chen, Elena Zhao, Shanna Su, Cassie Cao, Hao Wang, Haoran Zhu, Xirui Liu, Ying Tan, Xiaohan Zeng, Xiaoyu Meng, Gavin Li, Gaurav Rai, Hemanth Kolla, Ihor Hordiienko, Matheus Scharf, and Stepan Sydoruk, who helped bring this vision to life. Also thanks to Paige Schwartz, Stephanie Chu, Neal Cohen, Becky Ajuonuma, Iman Saleh, Dani Normanm, Javier Salido, and Lauren Mackevich for reviewing and editing.</p><p id="90b8" class="pw-post-body-paragraph ob oc io od b oe of og oh oi oj ok ol go om on oo gr op oq or gu os ot ou ov hq bk">Thanks to Jeremy Werner, Joy Zhang, Claire Cheng, Yashar Mehdad, Shuohao Zhang, Shawn Yan, Kelvin Xiong, Michael Lubavin, Teng Wang, Wei Ji, and Chenhao Yang for their leadership support in building conversational AI products at Airbnb.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/listening-learning-and-helping-at-scale-how-machine-learning-transforms-airbnbs-voice-support-b71f912d4760</link>
      <guid>https://medium.com/airbnb-engineering/listening-learning-and-helping-at-scale-how-machine-learning-transforms-airbnbs-voice-support-b71f912d4760</guid>
      <pubDate>Thu, 29 May 2025 19:29:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[How Airbnb Measures Listing Lifetime Value]]></title>
      <description><![CDATA[<div class="jf im jg jh ji"><div class="ab de"><div class="dl bh is it iu iv"><div><div></div><p id="e1ce" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">A deep dive on the framework that lets us identify the most valuable listings for our guests.</p><p id="3920" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk"><strong class="ou jm">By:</strong> <a class="ag hz" href="https://www.linkedin.com/in/carlossanchezmartinez/" rel="noopener ugc nofollow" target="_blank">Carlos Sanchez-Martinez</a>, <a class="ag hz" href="https://www.linkedin.com/in/seanmk2/" rel="noopener ugc nofollow" target="_blank">Sean O’Donnell</a>, <a class="ag hz" href="https://www.linkedin.com/in/lohua-yuan/" rel="noopener ugc nofollow" target="_blank">Lo-Hua Yuan</a>, <a class="ag hz" href="https://www.linkedin.com/in/yunshanz/" rel="noopener ugc nofollow" target="_blank">Yunshan Zhu</a></p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po pp"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*JSoY6CDkTMQEFXgP 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*JSoY6CDkTMQEFXgP 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*JSoY6CDkTMQEFXgP 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*JSoY6CDkTMQEFXgP 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*JSoY6CDkTMQEFXgP 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*JSoY6CDkTMQEFXgP 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*JSoY6CDkTMQEFXgP 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, 
(min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*JSoY6CDkTMQEFXgP 640w, https://miro.medium.com/v2/resize:fit:720/0*JSoY6CDkTMQEFXgP 720w, https://miro.medium.com/v2/resize:fit:750/0*JSoY6CDkTMQEFXgP 750w, https://miro.medium.com/v2/resize:fit:786/0*JSoY6CDkTMQEFXgP 786w, https://miro.medium.com/v2/resize:fit:828/0*JSoY6CDkTMQEFXgP 828w, https://miro.medium.com/v2/resize:fit:1100/0*JSoY6CDkTMQEFXgP 1100w, https://miro.medium.com/v2/resize:fit:1400/0*JSoY6CDkTMQEFXgP 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="07f1" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">At Airbnb, we always strive to provide our community with the best experience. To do so, it’s important to understand what kinds of accommodation listings are valuable to our guests. We achieve this by calculating and using estimates of <strong class="ou jm">listing lifetime value</strong>. 
These estimates not only allow us to identify which types of listings resonate best with guests, but also help us develop resources and recommendations for hosts to increase the value driven by their listings.</p><p id="e032" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Most of the existing literature on lifetime value focuses on traditional sales channels in which a single seller transacts with many buyers (e.g., a retailer selling clothing to its customers). In contrast, this blog post explains how we model lifetime value on a platform like Airbnb, with multiple sellers and buyers. In the first section, we describe our general listing lifetime value framework. In the second section, we discuss relevant challenges when putting this framework into practice.</p><h1 id="f48c" class="qb qc jl bf qd qe qf qg hh qh qi qj hj qk ql qm qn qo qp qq qr qs qt qu qv qw bk">Our Listing Lifetime Value Framework</h1><p id="9f49" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">Our listing lifetime value (LTV) framework estimates three different quantities of interest: baseline LTV, incremental LTV, and marketing-induced incremental LTV.</p><h2 id="6eb9" class="rc qc jl bf qd hg rd fa hh hi re fc hj hk rf hl hm hn rg ho hp hq rh hr hs ri bk">(1) Baseline LTV</h2><p id="55b4" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">To measure LTV, we need to define what we mean by “value” and what time horizon constitutes a “lifetime.” Simplifying slightly for the purposes of this blog post, we define and estimate our baseline listing LTV as the total number of bookings that a listing will receive on Airbnb over the next 365 days.</p><p id="61fb" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">We rely on machine learning and the rich 
information we have about our listings to estimate this quantity for each individual listing. In practice, we also follow financial guidance to arrive at present value by projecting outcomes into the future and applying a relevant discount rate to future value.</p><p id="c758" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Table 1 shows some hypothetical baseline LTV estimates. As you can see from the examples, LTV is not static, and can evolve as we improve the accuracy of our estimates, observe changes in our marketplace, or even develop a listing (e.g., by providing guidance that helps hosts improve the listing to get more bookings).</p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po rj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*PLgiegXaNpY8nthDFZpmpA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" 
type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*PLgiegXaNpY8nthDFZpmpA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*PLgiegXaNpY8nthDFZpmpA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*PLgiegXaNpY8nthDFZpmpA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*PLgiegXaNpY8nthDFZpmpA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*PLgiegXaNpY8nthDFZpmpA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*PLgiegXaNpY8nthDFZpmpA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*PLgiegXaNpY8nthDFZpmpA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rk gg rl pn po rm rn bf b bg z cm"><strong class="bf qd">Table 1. Example Listing LTV Estimates</strong></figcaption></figure><p id="70dd" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">We use baseline LTV estimates to segment our listings and identify which types of listings resonate best with our guests. This informs our supply expansion strategy. 
We also use baseline LTV to identify listings that are not expected to reach their full booking potential and may benefit from additional guidance.</p><h2 id="da64" class="rc qc jl bf qd hg rd fa hh hi re fc hj hk rf hl hm hn rg ho hp hq rh hr hs ri bk">(2) Incremental LTV</h2><p id="7841" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">When estimating lifetime value, we face a challenge that is common across multi-sided marketplaces: the transactions made by one listing might come at the expense of another listing’s transactions. For example, when a new listing joins our marketplace, this listing will get some bookings from guests who would otherwise have booked other listings. We need to account for this dynamic if we want to accurately measure how much value is <em class="ro">added</em> by each listing.</p><p id="7829" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">We address this challenge by creating “incremental LTV” estimates. We refer to the additional transactions that would not have occurred without the listing’s participation as “incremental value,” and the transactions that would have occurred even without the listing’s participation as “cannibalized value.” We estimate the incremental LTV for a listing by subtracting cannibalized value estimates from the baseline LTV. 
We explain this adjustment in more detail when discussing measurement challenges.</p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po rp"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*UH0hKFiaFYB-l_LL-CkSRQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*UH0hKFiaFYB-l_LL-CkSRQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*UH0hKFiaFYB-l_LL-CkSRQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*UH0hKFiaFYB-l_LL-CkSRQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*UH0hKFiaFYB-l_LL-CkSRQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*UH0hKFiaFYB-l_LL-CkSRQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*UH0hKFiaFYB-l_LL-CkSRQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*UH0hKFiaFYB-l_LL-CkSRQ.png 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rk gg rl pn po rm rn bf b bg z cm"><strong class="bf qd">Figure 1. Cannibalization.</strong> In this context, cannibalization refers to the transactions that would have occurred even without a listing’s participation in the marketplace. For example, when a new listing joins the platform, some bookings obtained by that listing would have been made at other listings on the platform had the new listing not joined.</figcaption></figure><h2 id="94de" class="rc qc jl bf qd hg rd fa hh hi re fc hj hk rf hl hm hn rg ho hp hq rh hr hs ri bk">(3) Marketing-induced incremental LTV</h2><p id="3c5c" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">Lifetime value is not static, and our LTV model needs to tell us how our internal initiatives bring additional listing value. For example, suppose we run a marketing campaign that provides hosts with tips on how to successfully improve their listings. To understand the return from the campaign, we need to measure how much value is accrued due to the campaign, and how much value would have been organically accrued without our marketing intervention. 
We calculate “marketing-induced incremental LTV” to measure how much additional listing LTV is created by our internal initiatives.</p><p id="6f1d" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Having outlined our measurement framework (summarized in Figure 2), we now cover some of the technical challenges we faced when putting this framework into practice.</p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po rq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*RIUqYmgP_5JWfAohtdCdBQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*RIUqYmgP_5JWfAohtdCdBQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*RIUqYmgP_5JWfAohtdCdBQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*RIUqYmgP_5JWfAohtdCdBQ.png 750w, 
https://miro.medium.com/v2/resize:fit:786/1*RIUqYmgP_5JWfAohtdCdBQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*RIUqYmgP_5JWfAohtdCdBQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*RIUqYmgP_5JWfAohtdCdBQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*RIUqYmgP_5JWfAohtdCdBQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rk gg rl pn po rm rn bf b bg z cm"><strong class="bf qd">Figure 2.</strong> <strong class="bf qd">Listing LTV Framework</strong></figcaption></figure><h1 id="8924" class="qb qc jl bf qd qe qf qg hh qh qi qj hj qk ql qm qn qo qp qq qr qs qt qu qv qw bk">Challenges when measuring Listing Lifetime Value</h1><h2 id="7ca0" class="rc qc jl bf qd hg rd fa hh hi re fc hj hk rf hl hm hn rg ho hp hq rh hr hs ri bk">Challenge (1): Accurately measuring baseline LTV</h2><p id="a009" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">The most important requirement for our framework is accurate estimation of baseline LTV. Figure 3 illustrates our estimation setup. First, we leverage listing features snapshotted at estimation time t. This data includes rich knowledge we have about each listing and host (availability, price, location, host tenure, etc). We then use these features to train our machine learning model. 
As a value label, we use the number of bookings made within the next 365-day period, which is observed on date t + 365.</p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po rr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*6EanAK-Y42jbcWyATva8GA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*6EanAK-Y42jbcWyATva8GA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*6EanAK-Y42jbcWyATva8GA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*6EanAK-Y42jbcWyATva8GA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*6EanAK-Y42jbcWyATva8GA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*6EanAK-Y42jbcWyATva8GA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*6EanAK-Y42jbcWyATva8GA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*6EanAK-Y42jbcWyATva8GA.png 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rk gg rl pn po rm rn bf b bg z cm"><strong class="bf qd">Figure 3. Label vs. Feature Collection. </strong>Our label lands 365 days after we collect the initial set of features for our model.</figcaption></figure><p id="2b7a" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">This setup has two important implications that impact accuracy and evaluation:</p><ul class=""><li id="1ba1" class="os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm rs rt ru bk">We have to wait 365 days to fully evaluate the accuracy of a prediction.</li><li id="56fb" class="os ot jl ou b ov rv ox oy oz rw pb pc hk rx pe pf hn ry ph pi hq rz pk pl pm rs rt ru bk">Our initial training data might not allow us to make accurate predictions if we observe shocks between the time when the training data was captured, and the time when we score the model.</li></ul><p id="cd90" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">In practice, we felt the full consequences of these implications during the COVID-19 pandemic, when travel came to a halt and marketplace dynamics changed drastically. Our model’s training data from before the pandemic had dramatically different characteristics relative to the scoring data we collected after the pandemic. 
When dealing with this shock, we implemented various strategies that helped us improve model accuracy:</p><ul class=""><li id="1f31" class="os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm rs rt ru bk">Shortening training windows, which reduced model drift.</li><li id="c397" class="os ot jl ou b ov rv ox oy oz rw pb pc hk rx pe pf hn ry ph pi hq rz pk pl pm rs rt ru bk">Feeding the model with granular geographic data and human-provided information about external factors as borders closed and reopened due to the pandemic.</li><li id="a664" class="os ot jl ou b ov rv ox oy oz rw pb pc hk rx pe pf hn ry ph pi hq rz pk pl pm rs rt ru bk">Adopting <a class="ag hz" href="http://lightgbm.readthedocs.io" rel="noopener ugc nofollow" target="_blank">LightGBM</a>, which handles high-cardinality features such as the geographic variables mentioned previously.</li></ul><h2 id="a1a2" class="rc qc jl bf qd hg rd fa hh hi re fc hj hk rf hl hm hn rg ho hp hq rh hr hs ri bk">Challenge (2): Measuring incrementality</h2><p id="e2cb" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">Accounting for incrementality is challenging because we never observe the ground truth. While we observe how many bookings are made per listing, we cannot tell which bookings are incremental and which bookings are cannibalized from other listings.</p><p id="edc9" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Since we don’t have an incrementality label to estimate this outcome directly, we instead estimate a production function. Intuitively, incrementality is heavily dependent on our ability to connect both sides of our marketplace. Production functions allow us to identify when our supply of listings and demand from guests connect and provide incremental value. 
Incrementality estimates will be high when a segment has high guest demand and relatively low listing supply. In contrast, incrementality will be low when segments have a large volume of listing supply and relatively low demand, meaning guests have an easy time finding a place to stay and a new listing is more likely to cannibalize bookings from other listings.</p><p id="b6f1" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Specifically, we model how our total supply of listings (S) and total demand from guests (D) impacts our target outcome bookings (O), as in equation (1):</p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po sa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*ccUD00N5xq2IfTiMkZHNGA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*ccUD00N5xq2IfTiMkZHNGA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*ccUD00N5xq2IfTiMkZHNGA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*ccUD00N5xq2IfTiMkZHNGA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*ccUD00N5xq2IfTiMkZHNGA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*ccUD00N5xq2IfTiMkZHNGA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*ccUD00N5xq2IfTiMkZHNGA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*ccUD00N5xq2IfTiMkZHNGA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="8977" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">We estimate this model with historical supply, demand, and outcome data aggregated across internally-defined segments that have little overlapping demand. Having estimated model (1), we calculate how extra supply of listings results in additional bookings in the given segment: this is our estimate of incrementality.</p><h2 id="4f4b" class="rc qc jl bf qd hg rd fa hh hi re fc hj hk rf hl hm hn rg ho hp hq rh hr hs ri bk">Challenge (3): Handling uncertainty</h2><p id="273c" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">To handle the uncertainty we experienced during the pandemic, we began updating our LTV estimates as listings received greater or fewer numbers of bookings than initially expected. 
This approach has helped us capture any shocks that occur after making our initial predictions.</p><p id="0d8d" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">To show how this can be useful, let’s go back to our marketing campaign example. Assume that we run this campaign for six months, and that we measure the success of this campaign by comparing marketing-induced incremental LTV against our total marketing investment in the campaign. As a first approach, we could use the initial baseline LTV figures (which feed into marketing-induced incremental LTV) estimated at the time when the listing was first targeted by our initiative. However, listings targeted on day 1 of the marketing campaign will have six months of booking history by the time the campaign ends and we evaluate success. A more accurate approach uses realized bookings after the initial prediction to start correcting for model error.</p><p id="cf5d" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Table 2 illustrates how this works. Suppose that on 2024-01-01, we expect that Listing A will get a total of 16 bookings by the end of the year. If, six months into the 365-day period, Listing A has already received 16 bookings, we should adjust its expected value upward to, say, 21 bookings. In fact, every day for 365 days after 2024-01-01, we can look at the bookings that Listing A has accrued and adjust the expected bookings accordingly. By construction, the expected and accrued bookings converge to the final bookings 365 days after the initial prediction date. 
Going back to our marketing example, if Listing A ultimately receives 20 bookings, updating the initial estimate means we went from 20% underprediction on day 0 to a more reasonable 5% overprediction as of month 6.</p><figure class="pq pr ps pt pu pv pn po paragraph-image"><div role="button" tabindex="0" class="pw px gk py bh pz"><div class="pn po sb"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*vNQ0046lY7rfWrHIJK6Oww.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*vNQ0046lY7rfWrHIJK6Oww.png 640w, https://miro.medium.com/v2/resize:fit:720/1*vNQ0046lY7rfWrHIJK6Oww.png 720w, https://miro.medium.com/v2/resize:fit:750/1*vNQ0046lY7rfWrHIJK6Oww.png 750w, https://miro.medium.com/v2/resize:fit:786/1*vNQ0046lY7rfWrHIJK6Oww.png 786w, https://miro.medium.com/v2/resize:fit:828/1*vNQ0046lY7rfWrHIJK6Oww.png 828w, 
https://miro.medium.com/v2/resize:fit:1100/1*vNQ0046lY7rfWrHIJK6Oww.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*vNQ0046lY7rfWrHIJK6Oww.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="rk gg rl pn po rm rn bf b bg z cm"><strong class="bf qd">Table 2.</strong> <strong class="bf qd">Example of how we update listing lifetime value estimates.</strong></figcaption></figure><p id="5233" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">In practice, we make daily adjustments to a listing’s expected value based on the listing’s accrued value, updated listing features, and value arrival patterns for similar listings estimated using historical data.</p><h1 id="c694" class="qb qc jl bf qd qe qf qg hh qh qi qj hj qk ql qm qn qo qp qq qr qs qt qu qv qw bk">Conclusion</h1><p id="a926" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">In this blog post, we explained how we approach listing lifetime value at Airbnb. We covered our measurement framework, including baseline LTV, incremental LTV, and marketing-induced incremental LTV. 
We also zoomed in on measurement challenges, such as when travel patterns changed drastically during the COVID pandemic and accurately estimating LTV became more difficult.</p><p id="ed12" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Estimating the lifetime value for each listing is important because it helps us serve our community more effectively. Use cases include:</p><ul class=""><li id="a59c" class="os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm rs rt ru bk">Identifying unique listing segments through which new hosts can showcase their hospitality to a large guest audience.</li><li id="2a2c" class="os ot jl ou b ov rv ox oy oz rw pb pc hk rx pe pf hn ry ph pi hq rz pk pl pm rs rt ru bk">Pinpointing locations where listings have an opportunity to get more bookings and might benefit from additional demand.</li><li id="f85c" class="os ot jl ou b ov rv ox oy oz rw pb pc hk rx pe pf hn ry ph pi hq rz pk pl pm rs rt ru bk">Identifying which internal marketing initiatives bring the most value to our community.</li></ul><p id="12f9" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">It’s also worth noting that our measurement framework may extend to other applications, such as the lifetime value for Airbnb Experiences listings, where the value of an experience listing will heavily depend on travel trends and on guests’ ability to discover these experiences.</p><p id="853c" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">We continue to solve interesting problems around LTV every day (and as more insights come up, we’ll keep sharing them on our blog). Can you see yourself making an impact here? 
If so, we encourage you to explore the <a class="ag hz" href="https://careers.airbnb.com/positions/?_departments=data-science" rel="noopener ugc nofollow" target="_blank">open roles on our team</a>.</p></div></div></div><div class="ab de sc sd se sf" role="separator"><div class="jf im jg jh ji"><div class="ab de"><div class="dl bh is it iu iv"><h1 id="93fb" class="qb qc jl bf qd qe sj qg hh qh sk qj hj qk sl qm qn qo sm qq qr qs sn qu qv qw bk">Acknowledgments</h1><p id="8a02" class="pw-post-body-paragraph os ot jl ou b ov qx ox oy oz qy pb pc hk qz pe pf hn ra ph pi hq rb pk pl pm jf bk">Finally, we need to give special thanks to Airfam and alumni Sam Barrows, Robert Chang, Linsha Chen, Richard Dear, Ruben Lobel, Brian De Luna, Dan T. Nguyen, Vaughn Quoss, Jason Ting, and Peng Ye. Without their foundational work, these LTV models would not have been possible.</p><p id="786e" class="pw-post-body-paragraph os ot jl ou b ov ow ox oy oz pa pb pc hk pd pe pf hn pg ph pi hq pj pk pl pm jf bk">Thanks as well to Rebecca Ajuonuma, Carolina Barcenas, Nathan Brixius, Jenny Chen, Peter Coles, Lauren Mackevich, Dan Schmierer, Yvonne Wang, Shanni Weilert, and Jane Zhang for their valuable feedback when writing this blog post.</p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/how-airbnb-measures-listing-lifetime-value-a603bf05142c</link>
      <guid>https://medium.com/airbnb-engineering/how-airbnb-measures-listing-lifetime-value-a603bf05142c</guid>
      <pubDate>Wed, 26 Mar 2025 16:46:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Embedding-Based Retrieval for Airbnb Search]]></title>
      <description><![CDATA[<div><div></div><figure class="ot ou ov ow ox oy oq or paragraph-image"><div role="button" tabindex="0" class="oz pa gi pb bh pc"><div class="oq or os"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*dhEL1kHnOpCWnqJa 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*dhEL1kHnOpCWnqJa 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*dhEL1kHnOpCWnqJa 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*dhEL1kHnOpCWnqJa 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*dhEL1kHnOpCWnqJa 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*dhEL1kHnOpCWnqJa 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*dhEL1kHnOpCWnqJa 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*dhEL1kHnOpCWnqJa 640w, https://miro.medium.com/v2/resize:fit:720/0*dhEL1kHnOpCWnqJa 720w, https://miro.medium.com/v2/resize:fit:750/0*dhEL1kHnOpCWnqJa 750w, https://miro.medium.com/v2/resize:fit:786/0*dhEL1kHnOpCWnqJa 786w, https://miro.medium.com/v2/resize:fit:828/0*dhEL1kHnOpCWnqJa 828w, https://miro.medium.com/v2/resize:fit:1100/0*dhEL1kHnOpCWnqJa 1100w, https://miro.medium.com/v2/resize:fit:1400/0*dhEL1kHnOpCWnqJa 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and 
(max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="ccd8" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">Our journey in applying embedding-based retrieval techniques to build an accurate and scalable candidate retrieval system for Airbnb Homes search</p><p id="c492" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">Authors: <a class="ag hx" href="https://www.linkedin.com/in/mustafa-moose-abdool-8aab037a/" rel="noopener ugc nofollow" target="_blank">Mustafa (Moose) Abdool</a>, <a class="ag hx" href="https://www.linkedin.com/in/soumyadip-banerjee-75991b42/" rel="noopener ugc nofollow" target="_blank">Soumyadip Banerjee</a>, <a class="ag hx" href="https://www.linkedin.com/in/kouyang1/" rel="noopener ugc nofollow" target="_blank">Karen Ouyang</a>, <a class="ag hx" href="https://www.linkedin.com/in/do-kyum-kim-9a810417/" rel="noopener ugc nofollow" target="_blank">Do-Kyum Kim</a>, <a class="ag hx" href="https://www.linkedin.com/in/moutupsi-paul/" rel="noopener ugc nofollow" target="_blank">Moutupsi Paul</a>, <a class="ag hx" href="https://www.linkedin.com/in/xiaowei-liu-60415841/" rel="noopener ugc nofollow" target="_blank">Xiaowei Liu</a>, <a class="ag hx" href="https://www.linkedin.com/in/bin-xu-96253aa5/" rel="noopener ugc nofollow" target="_blank">Bin Xu</a>, <a class="ag hx" href="https://www.linkedin.com/in/tracy-xiaoxi-yu/" rel="noopener ugc nofollow" target="_blank">Tracy Yu</a>, <a class="ag hx" href="https://www.linkedin.com/in/hui-gao-275a924/" rel="noopener ugc nofollow" target="_blank">Hui Gao</a>, <a class="ag hx" href="https://www.linkedin.com/in/yangbo-zhu/" 
rel="noopener ugc nofollow" target="_blank">Yangbo Zhu</a>, <a class="ag hx" href="https://www.linkedin.com/in/huiji-gao/" rel="noopener ugc nofollow" target="_blank">Huiji Gao</a>, <a class="ag hx" href="https://www.linkedin.com/in/liweihe/" rel="noopener ugc nofollow" target="_blank">Liwei He</a>, <a class="ag hx" href="https://www.linkedin.com/in/sanjeevkatariya/" rel="noopener ugc nofollow" target="_blank">Sanjeev Katariya</a></p><h1 id="9f6c" class="pz qa jj bf qb qc qd qe hf qf qg qh hh qi qj qk ql qm qn qo qp qq qr qs qt qu bk">Introduction</h1><p id="54e7" class="pw-post-body-paragraph pe pf jj pg b ph qv pj pk pl qw pn po hi qx pq pr hl qy pt pu ho qz pw px py jd bk">Search plays a crucial role in helping Airbnb guests find the perfect stay. The goal of Airbnb Search is to surface the most relevant listings for each user’s query — but with millions of available homes, that’s no easy task. It’s especially difficult when searches include large geographic areas (like California or France) or high-demand destinations (like Paris or London). Recent innovations — such as <em class="ra">flexible date search</em>, which allows guests to explore stays without fixed check-in and check-out dates — have added yet another layer of complexity to ranking and finding the right results.</p><p id="93c9" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">To tackle these challenges, we need a system that can retrieve relevant homes while also being scalable enough (in terms of latency and compute) to handle queries with a large candidate count. In this blog post, we share our journey in building Airbnb’s first-ever Embedding-Based Retrieval (EBR) search system. 
The goal of this system is to narrow down the initial set of eligible homes into a smaller pool, which can then be scored by more compute-intensive machine learning models later in the search ranking process.</p><figure class="rc rd re rf rg oy oq or paragraph-image"><div role="button" tabindex="0" class="oz pa gi pb bh pc"><div class="oq or rb"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*TCeRWXyWhaTJeGfp 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*TCeRWXyWhaTJeGfp 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*TCeRWXyWhaTJeGfp 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*TCeRWXyWhaTJeGfp 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*TCeRWXyWhaTJeGfp 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*TCeRWXyWhaTJeGfp 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*TCeRWXyWhaTJeGfp 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*TCeRWXyWhaTJeGfp 640w, https://miro.medium.com/v2/resize:fit:720/0*TCeRWXyWhaTJeGfp 720w, https://miro.medium.com/v2/resize:fit:750/0*TCeRWXyWhaTJeGfp 750w, https://miro.medium.com/v2/resize:fit:786/0*TCeRWXyWhaTJeGfp 786w, https://miro.medium.com/v2/resize:fit:828/0*TCeRWXyWhaTJeGfp 828w, https://miro.medium.com/v2/resize:fit:1100/0*TCeRWXyWhaTJeGfp 1100w, https://miro.medium.com/v2/resize:fit:1400/0*TCeRWXyWhaTJeGfp 1400w" sizes="(min-resolution: 4dppx) and (max-width: 
700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="7e29" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk"><strong class="pg jk">Figure 1:</strong> The general stages and scale for the various types of ranking models used in Airbnb Search</p><p id="ed08" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">We’ll explore three key challenges in building this EBR system: (1) constructing training data, (2) designing the model architecture, and (3) developing an online serving strategy using Approximate Nearest Neighbor (ANN) solutions.</p><h1 id="7ffa" class="pz qa jj bf qb qc qd qe hf qf qg qh hh qi qj qk ql qm qn qo qp qq qr qs qt qu bk">Training Data Construction</h1><p id="89fc" class="pw-post-body-paragraph pe pf jj pg b ph qv pj pk pl qw pn po hi qx pq pr hl qy pt pu ho qz pw px py jd bk">The first step in building our EBR system was training a machine learning model to map both homes and de-identified search queries into numerical vectors. To achieve this, we built a training data pipeline (Figure 3) that leveraged contrastive learning — a strategy that involves identifying pairs of positive- and negative-labeled homes for a given query. 
During training, the model learns to map a query, a positive home, and a negative home each into a numerical vector, such that the similarity between the query and the positive home is much higher than the similarity between the query and the negative home.</p><p id="3d7e" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">To construct these pairs, we devised a sampling method based on user trips. This was an important design decision, since users on Airbnb generally undergo a multi-stage search journey. Data shows that before making a final booking, users tend to perform multiple searches and take various actions, such as clicking into a home’s details, reading reviews, or adding a home to a wishlist. As such, it was crucial to develop a strategy that captures this entire multi-stage journey and accounts for the diverse types of listings a user might explore.</p><p id="1079" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">Diving deeper, we first grouped all historical queries of users who made bookings, using key query parameters such as location, number of guests, and length of stay; each such group is our definition of a “trip.” For each trip, we analyzed all searches performed by the user, with the final booked listing as the positive label. To construct (positive, negative) pairs, we paired this booked listing with other homes the user had seen but not booked. Negative labels were selected from homes the user encountered in search results, along with those they had interacted with more intentionally (such as by wishlisting) but ultimately did not book. 
This choice of negative labels was key: Randomly sampling homes made the problem too easy and resulted in poor model performance.</p><figure class="rc rd re rf rg oy oq or paragraph-image"><div role="button" tabindex="0" class="oz pa gi pb bh pc"><div class="oq or rh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*3rXf0K0bJoObo17- 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*3rXf0K0bJoObo17- 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*3rXf0K0bJoObo17- 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*3rXf0K0bJoObo17- 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*3rXf0K0bJoObo17- 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*3rXf0K0bJoObo17- 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*3rXf0K0bJoObo17- 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*3rXf0K0bJoObo17- 640w, https://miro.medium.com/v2/resize:fit:720/0*3rXf0K0bJoObo17- 720w, https://miro.medium.com/v2/resize:fit:750/0*3rXf0K0bJoObo17- 750w, https://miro.medium.com/v2/resize:fit:786/0*3rXf0K0bJoObo17- 786w, https://miro.medium.com/v2/resize:fit:828/0*3rXf0K0bJoObo17- 828w, https://miro.medium.com/v2/resize:fit:1100/0*3rXf0K0bJoObo17- 1100w, https://miro.medium.com/v2/resize:fit:1400/0*3rXf0K0bJoObo17- 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="f1d7" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk"><strong class="pg jk">Figure 2: </strong>Example of constructing (positive, negative) pairs for a given user journey. The booked home is always treated as a positive. Negatives are selected from homes that appeared in the search result (and were potentially interacted with) but that the user did not end up booking.</p><figure class="rc rd re rf rg oy oq or paragraph-image"><div class="oq or ri"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*AUApsIPfEFdmx_S- 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*AUApsIPfEFdmx_S- 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*AUApsIPfEFdmx_S- 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*AUApsIPfEFdmx_S- 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*AUApsIPfEFdmx_S- 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*AUApsIPfEFdmx_S- 1100w, https://miro.medium.com/v2/resize:fit:1116/format:webp/0*AUApsIPfEFdmx_S- 1116w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 558px" 
type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*AUApsIPfEFdmx_S- 640w, https://miro.medium.com/v2/resize:fit:720/0*AUApsIPfEFdmx_S- 720w, https://miro.medium.com/v2/resize:fit:750/0*AUApsIPfEFdmx_S- 750w, https://miro.medium.com/v2/resize:fit:786/0*AUApsIPfEFdmx_S- 786w, https://miro.medium.com/v2/resize:fit:828/0*AUApsIPfEFdmx_S- 828w, https://miro.medium.com/v2/resize:fit:1100/0*AUApsIPfEFdmx_S- 1100w, https://miro.medium.com/v2/resize:fit:1116/0*AUApsIPfEFdmx_S- 1116w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 558px" /></picture></div></figure><p id="a19a" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk"><strong class="pg jk">Figure 3: </strong>Example of overall data pipeline used to construct training data for the EBR model.</p><h1 id="a19c" class="pz qa jj bf qb qc qd qe hf qf qg qh hh qi qj qk ql qm qn qo qp qq qr qs qt qu bk">Model Architecture</h1><p id="7c66" class="pw-post-body-paragraph pe pf jj pg b ph qv pj pk pl qw pn po hi qx pq pr hl qy pt pu ho qz pw px py jd bk">The model architecture followed a traditional two-tower network design. One tower (the <em class="ra">listing tower</em>) processes features about the home listing itself — such as historical engagement, amenities, and guest capacity. The other tower (the <em class="ra">query tower</em>) processes features related to the search query — such as the geographic search location, number of guests, and length of stay. 
Together, these towers generate the embeddings for home listings and search queries, respectively.</p><p id="133b" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">A key design decision here was choosing features such that the listing tower could be computed offline on a daily basis. This enabled us to pre-compute the home embeddings in a daily batch job, significantly reducing online latency, since only the query tower had to be evaluated in real-time for incoming search requests.</p><figure class="rc rd re rf rg oy oq or paragraph-image"><div role="button" tabindex="0" class="oz pa gi pb bh pc"><div class="oq or rj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*rLXZmgqFE5BiSOGS 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*rLXZmgqFE5BiSOGS 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*rLXZmgqFE5BiSOGS 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*rLXZmgqFE5BiSOGS 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*rLXZmgqFE5BiSOGS 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*rLXZmgqFE5BiSOGS 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*rLXZmgqFE5BiSOGS 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*rLXZmgqFE5BiSOGS 640w, https://miro.medium.com/v2/resize:fit:720/0*rLXZmgqFE5BiSOGS 720w, 
https://miro.medium.com/v2/resize:fit:750/0*rLXZmgqFE5BiSOGS 750w, https://miro.medium.com/v2/resize:fit:786/0*rLXZmgqFE5BiSOGS 786w, https://miro.medium.com/v2/resize:fit:828/0*rLXZmgqFE5BiSOGS 828w, https://miro.medium.com/v2/resize:fit:1100/0*rLXZmgqFE5BiSOGS 1100w, https://miro.medium.com/v2/resize:fit:1400/0*rLXZmgqFE5BiSOGS 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="65cc" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk"><strong class="pg jk">Figure 4: </strong>Two-tower architecture as used in the EBR model. Note that the listing tower is computed offline daily for all homes.</p><h1 id="951a" class="pz qa jj bf qb qc qd qe hf qf qg qh hh qi qj qk ql qm qn qo qp qq qr qs qt qu bk">Online Serving</h1><p id="6409" class="pw-post-body-paragraph pe pf jj pg b ph qv pj pk pl qw pn po hi qx pq pr hl qy pt pu ho qz pw px py jd bk">The final step in building our EBR system was choosing the infrastructure for online serving. We explored a number of approximate nearest neighbor (ANN) solutions and narrowed them down to two main candidates: inverted file index (IVF) and hierarchical navigable small worlds (HNSW). 
While HNSW performed slightly better on recall, our main evaluation metric, we ultimately found that IVF offered the best trade-off between speed and retrieval quality.</p><p id="6dce" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">The core reason for this is the high volume of real-time updates per second for Airbnb home listings, as pricing and availability data are frequently updated. This caused the memory footprint of the HNSW index to grow too large. In addition, most Airbnb searches include filters, especially geographic filters. We found that parallel retrieval with HNSW alongside filters resulted in poor latency.</p><p id="5742" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">In contrast, the IVF solution, where listings are clustered beforehand, only required storing cluster centroids and cluster assignments within our search index. 
At serving time, we simply retrieve listings from the top clusters by treating the cluster assignments as a standard search filter, making integration with our existing search system quite straightforward.</p><figure class="rc rd re rf rg oy oq or paragraph-image"><div class="oq or rk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*WH3lbXvph3aBkPBY 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*WH3lbXvph3aBkPBY 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*WH3lbXvph3aBkPBY 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*WH3lbXvph3aBkPBY 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*WH3lbXvph3aBkPBY 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*WH3lbXvph3aBkPBY 1100w, https://miro.medium.com/v2/resize:fit:1206/format:webp/0*WH3lbXvph3aBkPBY 1206w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 603px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*WH3lbXvph3aBkPBY 640w, https://miro.medium.com/v2/resize:fit:720/0*WH3lbXvph3aBkPBY 720w, https://miro.medium.com/v2/resize:fit:750/0*WH3lbXvph3aBkPBY 750w, https://miro.medium.com/v2/resize:fit:786/0*WH3lbXvph3aBkPBY 786w, https://miro.medium.com/v2/resize:fit:828/0*WH3lbXvph3aBkPBY 828w, https://miro.medium.com/v2/resize:fit:1100/0*WH3lbXvph3aBkPBY 1100w, https://miro.medium.com/v2/resize:fit:1206/0*WH3lbXvph3aBkPBY 1206w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 
700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 603px" /></picture></div></figure><p id="0041" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk"><strong class="pg jk">Figure 5: </strong>Overall serving flow using IVF. Homes are clustered beforehand and, during online serving, homes are retrieved from the closest clusters to the query embedding.</p><p id="1ffa" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">In this approach, our choice of similarity function in the EBR model itself ended up having interesting implications. We explored both dot product and Euclidean distance; while both performed similarly from a model perspective, using Euclidean distance produced much more balanced clusters on average. 
This was a key insight, as the quality of IVF retrieval is highly sensitive to cluster size uniformity: If one cluster had too many homes, it would greatly reduce the discriminative power of our retrieval system.</p><p id="d4a9" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">We hypothesize that this imbalance arises with dot product similarity because it inherently only considers the direction of feature vectors while ignoring their magnitudes — whereas many of our underlying features are based on historical counts, making magnitude an important factor.</p><figure class="rc rd re rf rg oy oq or paragraph-image"><div role="button" tabindex="0" class="oz pa gi pb bh pc"><div class="oq or rl"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*RlVEZwdCwA5j4cwo 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*RlVEZwdCwA5j4cwo 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*RlVEZwdCwA5j4cwo 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*RlVEZwdCwA5j4cwo 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*RlVEZwdCwA5j4cwo 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*RlVEZwdCwA5j4cwo 1100w, https://miro.medium.com/v2/resize:fit:1390/format:webp/0*RlVEZwdCwA5j4cwo 1390w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 695px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*RlVEZwdCwA5j4cwo 640w, 
https://miro.medium.com/v2/resize:fit:720/0*RlVEZwdCwA5j4cwo 720w, https://miro.medium.com/v2/resize:fit:750/0*RlVEZwdCwA5j4cwo 750w, https://miro.medium.com/v2/resize:fit:786/0*RlVEZwdCwA5j4cwo 786w, https://miro.medium.com/v2/resize:fit:828/0*RlVEZwdCwA5j4cwo 828w, https://miro.medium.com/v2/resize:fit:1100/0*RlVEZwdCwA5j4cwo 1100w, https://miro.medium.com/v2/resize:fit:1390/0*RlVEZwdCwA5j4cwo 1390w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 695px" /></picture></div></div></figure><p id="9d29" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk"><strong class="pg jk">Figure 6: </strong>Example of the distribution of cluster sizes when using dot product vs. Euclidean distance as a similarity measure. We found that Euclidean distance produced much more balanced cluster sizes.</p><h1 id="474a" class="pz qa jj bf qb qc qd qe hf qf qg qh hh qi qj qk ql qm qn qo qp qq qr qs qt qu bk">Results</h1><p id="fd5f" class="pw-post-body-paragraph pe pf jj pg b ph qv pj pk pl qw pn po hi qx pq pr hl qy pt pu ho qz pw px py jd bk">The EBR system described in this post was fully launched in both Search and Email Marketing production and led to a statistically-significant gain in overall bookings when A/B tested. 
Notably, the bookings lift from this new retrieval system was on par with some of the largest machine learning improvements to our search ranking in the past two years.</p><p id="3fc9" class="pw-post-body-paragraph pe pf jj pg b ph pi pj pk pl pm pn po hi pp pq pr hl ps pt pu ho pv pw px py jd bk">The key improvement over the baseline was that our EBR system effectively incorporated query context, allowing homes to be ranked more accurately during retrieval. This ultimately helped us display more relevant results to users, especially for queries with a high number of eligible results.</p><h1 id="bc88" class="pz qa jj bf qb qc qd qe hf qf qg qh hh qi qj qk ql qm qn qo qp qq qr qs qt qu bk">Acknowledgments</h1><p id="d216" class="pw-post-body-paragraph pe pf jj pg b ph qv pj pk pl qw pn po hi qx pq pr hl qy pt pu ho qz pw px py jd bk">We would like to especially thank the entire Search and Knowledge Infrastructure &amp; ML Infrastructure org (led by <a class="ag hx" href="https://www.linkedin.com/in/yi-li-755a6b24/" rel="noopener ugc nofollow" target="_blank">Yi Li</a>) and Marketing Technology org (led by <a class="ag hx" href="https://www.linkedin.com/in/michael-kinoti-7a309215/" rel="noopener ugc nofollow" target="_blank">Michael Kinoti</a>) for their great collaborations throughout this project!</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/embedding-based-retrieval-for-airbnb-search-aabebfc85839</link>
      <guid>https://medium.com/airbnb-engineering/embedding-based-retrieval-for-airbnb-search-aabebfc85839</guid>
      <pubDate>Wed, 19 Mar 2025 18:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Accelerating Large-Scale Test Migration with LLMs]]></title>
      <description><![CDATA[<div><div></div><figure class="mx my mz na nb nc mu mv paragraph-image"><div role="button" tabindex="0" class="nd ne fj nf bh ng"><div class="mu mv mw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*j0QXnA13Sy5ruaIAU5C_eg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="4c1e" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">by: Charles Covey-Brandt</p><p id="89dd" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Airbnb recently completed our first large-scale, LLM-driven code migration, updating nearly 3.5K React component test files from Enzyme to use React Testing Library (RTL) instead. We’d originally estimated this would take 1.5 years of engineering time to do by hand, but — using a combination of frontier models and robust automation — we finished the entire migration in just 6 weeks.</p><p id="29b0" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">In this blog post, we’ll highlight the unique challenges we faced migrating from Enzyme to RTL, how LLMs excel at solving this particular type of challenge, and how we structured our migration tooling to run an LLM-driven migration at scale.</p><h1 id="d5be" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">Background</h1><p id="4b0b" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">In 2020, Airbnb adopted React Testing Library (RTL) for all new React component test development, marking our first steps away from Enzyme. 
Although Enzyme had served us well since 2015, it was designed for earlier versions of React, and the framework’s deep access to component internals no longer aligned with modern React testing practices.</p><p id="afd1" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">However, because of the fundamental differences between these frameworks, we couldn’t easily swap out one for the other (read more about the differences <a class="af pj" href="https://kentcdodds.com/blog/introducing-the-react-testing-library" rel="noopener ugc nofollow" target="_blank">here</a>). We also couldn’t just delete the Enzyme files, as analysis showed this would create significant gaps in our code coverage. To complete this migration, we needed an automated way to refactor test files from Enzyme to RTL while preserving the intent of the original tests <em class="pk">and</em> their code coverage.</p><h1 id="3b8e" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">How We Did It</h1><p id="4e1d" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">In mid-2023, an Airbnb hackathon team demonstrated that large language models could successfully convert hundreds of Enzyme files to RTL in just a few days.</p><p id="5756" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Building on this promising result, in 2024 we developed a scalable pipeline for an LLM-driven migration. We broke the migration into discrete, per-file steps that we could parallelize, added configurable retry loops, and significantly expanded our prompts with additional context. Finally, we performed breadth-first prompt tuning for the long tail of complex files.</p><h1 id="f20f" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">1. 
File Validation and Refactor Steps</h1><p id="bb71" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">We started by breaking down the migration into a series of automated validation and refactor steps. Think of it like a production pipeline: each file moves through stages of validation, and when a check fails, we bring in the LLM to fix it.</p><p id="7f54" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">We modeled this flow like a state machine, moving the file to the next state only after validation on the previous state passed:</p><figure class="pm pn po pp pq nc mu mv paragraph-image"><div class="mu mv pl"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*cHwpLgo6nzx8bROe 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*cHwpLgo6nzx8bROe 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*cHwpLgo6nzx8bROe 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*cHwpLgo6nzx8bROe 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*cHwpLgo6nzx8bROe 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*cHwpLgo6nzx8bROe 1100w, https://miro.medium.com/v2/resize:fit:1174/format:webp/0*cHwpLgo6nzx8bROe 1174w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 587px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*cHwpLgo6nzx8bROe 640w, 
https://miro.medium.com/v2/resize:fit:720/0*cHwpLgo6nzx8bROe 720w, https://miro.medium.com/v2/resize:fit:750/0*cHwpLgo6nzx8bROe 750w, https://miro.medium.com/v2/resize:fit:786/0*cHwpLgo6nzx8bROe 786w, https://miro.medium.com/v2/resize:fit:828/0*cHwpLgo6nzx8bROe 828w, https://miro.medium.com/v2/resize:fit:1100/0*cHwpLgo6nzx8bROe 1100w, https://miro.medium.com/v2/resize:fit:1174/0*cHwpLgo6nzx8bROe 1174w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 587px" /></picture></div><figcaption class="pr ff ps mu mv pt pu bf b bg z du">Diagram shows refactor steps from Enzyme refactor, fixing Jest, fixing lint and tsc, and marking file as complete.</figcaption></figure><p id="e613" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">This step-based approach provided a solid foundation for our automation pipeline. It enabled us to track progress, improve failure rates for specific steps, and rerun files or steps when needed. The step-based approach also made it simple to run migrations on hundreds of files concurrently, which was critical for both quickly migrating simple files, and chipping away at the long tail of files later in the migration.</p><h1 id="0e90" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">2. 
Retry Loops &amp; Dynamic Prompting</h1><p id="bd6b" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">Early on in the migration, we experimented with different prompt engineering strategies to improve our per-file migration success rate. However, building on the stepped approach, we found the most effective route to improve outcomes was simply brute force: retry steps multiple times until they passed or we reached a limit. We updated our steps to use dynamic prompts for each retry, giving the validation errors and the most recent version of the file to the LLM, and built a loop runner that ran each step up to a configurable number of attempts.</p><figure class="pm pn po pp pq nc mu mv paragraph-image"><div role="button" tabindex="0" class="nd ne fj nf bh ng"><div class="mu mv pv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*XtBaeswbgYOBY_uP 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*XtBaeswbgYOBY_uP 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*XtBaeswbgYOBY_uP 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*XtBaeswbgYOBY_uP 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*XtBaeswbgYOBY_uP 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*XtBaeswbgYOBY_uP 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*XtBaeswbgYOBY_uP 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/0*XtBaeswbgYOBY_uP 640w, https://miro.medium.com/v2/resize:fit:720/0*XtBaeswbgYOBY_uP 720w, https://miro.medium.com/v2/resize:fit:750/0*XtBaeswbgYOBY_uP 750w, https://miro.medium.com/v2/resize:fit:786/0*XtBaeswbgYOBY_uP 786w, https://miro.medium.com/v2/resize:fit:828/0*XtBaeswbgYOBY_uP 828w, https://miro.medium.com/v2/resize:fit:1100/0*XtBaeswbgYOBY_uP 1100w, https://miro.medium.com/v2/resize:fit:1400/0*XtBaeswbgYOBY_uP 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr ff ps mu mv pt pu bf b bg z du"><em class="pw">Diagram of a retry loop. For a given step N, if the file has errors, we retry validation and attempt to fix errors unless we hit the max retries or the file no longer contains errors.</em></figcaption></figure><p id="8ba7" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">With this simple retry loop, we found we could successfully migrate a large number of our simple-to-medium complexity test files, with some finishing successfully after a few retries, and most by 10 attempts.</p><h1 id="c032" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">3. Increasing the Context</h1><p id="e324" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">For test files up to a certain complexity, just increasing our retry attempts worked well. 
However, to handle files with intricate test state setups or excessive indirection, we found the best approach was to push as much relevant context as possible into our prompts.</p><p id="05f4" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">By the end of the migration, our prompts had expanded to anywhere from 40,000 to 100,000 tokens, pulling in as many as 50 related files, a whole host of manually written <a class="af pj" href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/multishot-prompting" rel="noopener ugc nofollow" target="_blank">few-shot</a> examples, and examples of existing, well-written, passing test files from within the same project.</p><p id="3514" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Each prompt included:</p><ul class=""><li id="e3f8" class="ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of px py pz bk">The source code of the component under test</li><li id="1fdc" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of px py pz bk">The test file we were migrating</li><li id="eb57" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of px py pz bk">Validation failures for the step</li><li id="b55a" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of px py pz bk">Related tests from the same directory (maintaining team-specific patterns)</li><li id="7906" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of px py pz bk">General migration guidelines and common solutions</li></ul><p id="fecc" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Here’s how that looked in practice (significantly trimmed down for readability):</p><pre class="pm pn po pp pq qf qg qh bp qi bb 
bk">// Code example shows a trimmed down version of a prompt <br />// including the raw source code from related files, imports, <br />// examples, the component source itself, and the test file to migrate.<br />const prompt = [<br />  'Convert this Enzyme test to React Testing Library:',<br />  `SIBLING TESTS:\n${siblingTestFilesSourceCode}`,<br />  `RTL EXAMPLES:\n${reactTestingLibraryExamples}`,<br />  `IMPORTS:\n${nearestImportSourceCode}`,<br />  `COMPONENT SOURCE:\n${componentFileSourceCode}`,<br />  `TEST TO MIGRATE:\n${testFileSourceCode}`,<br />].join('\n\n');</pre><p id="d19b" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">This rich context approach proved highly effective for these more complex files: the LLM could better understand team-specific patterns, common testing approaches, and the overall architecture of the codebase.</p><p id="6d25" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">We should note that, although we did some prompt engineering at this step, the main success driver we saw was choosing the <em class="pk">right</em> related files (finding nearby files, good example files from the same project, filtering the dependencies for files that were relevant to the component, etc.), rather than getting the prompt engineering perfect.</p><p id="d2c1" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">In our first bulk run, after building and testing our migration scripts with retries and rich contexts, <strong class="nk gv">we successfully migrated 75% of our target files in just four hours</strong>.</p><h1 id="93c5" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">4. 
From 75% to 97%: Systematic Improvement</h1><p id="c73c" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">That 75% success rate was really exciting to get to, but it still left us with nearly 900 files failing our step-based validation criteria. To tackle this long tail, we needed a systematic way to understand where remaining files were getting stuck and improve our migration scripts to address these issues. We also wanted to do this <em class="pk">breadth first</em> to aggressively chip away at our remaining files without getting stuck on the most difficult migration cases.</p><p id="690c" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">To do this, we built two features into our migration tooling.</p><p id="915a" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">First, we built a simple system to give us visibility into common issues our scripts were facing by stamping files with an automatically-generated comment to record the status of each migration step. 
Here’s what that code comment looked like:</p><pre class="pm pn po pp pq qf qg qh bp qi bb bk">// MIGRATION STATUS: {"enzyme":"done","jest":{"passed":8,"failed":2,"total":10,"skipped":0,"successRate":80},"eslint":"pending","tsc":"pending"}</pre><p id="5549" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">And second, we added the ability to easily re-run single files or path patterns, filtered by the specific step they were stuck on:</p><pre class="pm pn po pp pq qf qg qh bp qi bb bk">$ llm-bulk-migration --step=fix-jest --match=project-abc/**</pre><p id="1aa6" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Using these two features, we could quickly run a feedback loop to improve our prompts and tooling:</p><ol class=""><li id="3d22" class="ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of qo py pz bk">Run all remaining failing files to find common issues the LLM is getting stuck on</li><li id="b93e" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of qo py pz bk">Select a sample of files (5 to 10) that exemplify a common issue</li><li id="f507" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of qo py pz bk">Update our prompts and scripts to address that issue</li><li id="5da3" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of qo py pz bk">Re-run against the sample of failing files to validate our fix</li><li id="5361" class="ni nj gu nk b nl qa nn no np qb nr ns nt qc nv nw nx qd nz oa ob qe od oe of qo py pz bk">Repeat by running against all remaining files again</li></ol><p id="f527" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">After running this “sample, tune, sweep” loop for 4 days, we had pushed our completed files from 75% to 97% of the 
total files, and had just under 100 files remaining. By this point, we had retried many of these long-tail files anywhere from 50 to 100 times, and it seemed we were hitting a ceiling on what we could fix via automation. Rather than invest in more tuning, we opted to manually fix the remaining files, working from the baseline (failing) refactors to reduce the time to get those files over the finish line.</p><h1 id="7d0c" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">Results and Impact</h1><p id="bc49" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">With the validation and refactor pipeline, retry loops, and expanded context in place, we were able to automatically migrate 75% of our target files in 4 hours.</p><p id="a684" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">After four days of prompt and script refinement using the “sample, tune, sweep” strategy, we reached 97% of the 3.5K original Enzyme files.</p><p id="093c" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">And for the remaining 3% of files that didn’t complete through automation, our scripts provided a great baseline for manual intervention, allowing us to complete the migration for those remaining files in another week of work.</p><p id="7a35" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Most importantly, we were able to replace Enzyme while maintaining original test intent and our overall code coverage. 
And even with high retry counts on the long tail of the migration, the total cost — including LLM API usage and six weeks of engineering time — proved far more efficient than our original manual migration estimate.</p><h1 id="74d4" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">What’s Next</h1><p id="f749" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk">This migration underscores the power of LLMs for large-scale code transformation. We plan to expand this approach, develop more sophisticated migration tools, and explore new applications of LLM-powered automation to enhance developer productivity.</p><p id="3f61" class="pw-post-body-paragraph ni nj gu nk b nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of gn bk">Want to help shape the future of developer tools? We’re hiring engineers who love solving complex problems at scale. Check out our <a class="af pj" href="https://careers.airbnb.com" rel="noopener ugc nofollow" target="_blank">careers page</a> to learn more.</p><h1 id="16b0" class="og oh gu bf oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bk">****************</h1><p id="204c" class="pw-post-body-paragraph ni nj gu nk b nl pe nn no np pf nr ns nt pg nv nw nx ph nz oa ob pi od oe of gn bk"><em class="pk">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/accelerating-large-scale-test-migration-with-llms-9565c208023b</link>
      <guid>https://medium.com/airbnb-engineering/accelerating-large-scale-test-migration-with-llms-9565c208023b</guid>
      <pubDate>Thu, 13 Mar 2025 18:01:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Improving Search Ranking for Maps]]></title>
      <description><![CDATA[<div><div></div><p id="b93d" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">How Airbnb is adapting ranking for our map interface.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv nw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*DO7m1JZFPSvVRlBG 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*DO7m1JZFPSvVRlBG 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*DO7m1JZFPSvVRlBG 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*DO7m1JZFPSvVRlBG 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*DO7m1JZFPSvVRlBG 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*DO7m1JZFPSvVRlBG 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*DO7m1JZFPSvVRlBG 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*DO7m1JZFPSvVRlBG 640w, https://miro.medium.com/v2/resize:fit:720/0*DO7m1JZFPSvVRlBG 720w, https://miro.medium.com/v2/resize:fit:750/0*DO7m1JZFPSvVRlBG 750w, https://miro.medium.com/v2/resize:fit:786/0*DO7m1JZFPSvVRlBG 786w, https://miro.medium.com/v2/resize:fit:828/0*DO7m1JZFPSvVRlBG 828w, https://miro.medium.com/v2/resize:fit:1100/0*DO7m1JZFPSvVRlBG 1100w, https://miro.medium.com/v2/resize:fit:1400/0*DO7m1JZFPSvVRlBG 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="2f0e" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://www.linkedin.com/in/malayhaldar/" rel="noopener ugc nofollow" target="_blank">Malay Haldar</a>, <a class="af oi" href="https://www.linkedin.com/in/hongwei-zhang-86b15624/" rel="noopener ugc nofollow" target="_blank">Hongwei Zhang</a>, <a class="af oi" href="https://www.linkedin.com/in/kedar-bellare-3048128a/" rel="noopener ugc nofollow" target="_blank">Kedar Bellare</a>, <a class="af oi" href="https://www.linkedin.com/in/sherrytchen/" rel="noopener ugc nofollow" target="_blank">Sherry Chen</a></p><p id="d1b7" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Search is the core mechanism that connects guests with Hosts at Airbnb. Results from a guest’s search for listings are displayed through two interfaces: (1) as a list of rectangular cards that contain the listing image, price, rating, and other details, referred to as <em class="oj">list-results</em>, and (2) as oval pins on a map showing the listing price, called <em class="oj">map-results</em>. 
Since its inception, the core of the ranking algorithm that powered both these interfaces was the same — ordering listings by their booking probabilities and selecting the top listings for display.</p><p id="087b" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">But some of the basic assumptions underlying ranking, built for a world where search results are presented as lists, simply break down for maps.</p><h1 id="6fc0" class="ok ol gu bf om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bk">What Is Different About Maps?</h1><p id="0926" class="pw-post-body-paragraph mw mx gu my b mz pi nb nc nd pj nf ng nh pk nj nk nl pl nn no np pm nr ns nt gn bk">The central concept that drives ranking for list-results is that <em class="oj">user attention decays</em> starting from the top of the list, going down towards the bottom. A plot of rank vs click-through rates in Figure 1 illustrates this concept. X-axis represents the rank of listings in search results. 
Y-axis represents the click-through rate (CTR) for listings at the particular rank.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv pn"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Y9drAzLenJ9GAYEA 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Y9drAzLenJ9GAYEA 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Y9drAzLenJ9GAYEA 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Y9drAzLenJ9GAYEA 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Y9drAzLenJ9GAYEA 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Y9drAzLenJ9GAYEA 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Y9drAzLenJ9GAYEA 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Y9drAzLenJ9GAYEA 640w, https://miro.medium.com/v2/resize:fit:720/0*Y9drAzLenJ9GAYEA 720w, https://miro.medium.com/v2/resize:fit:750/0*Y9drAzLenJ9GAYEA 750w, https://miro.medium.com/v2/resize:fit:786/0*Y9drAzLenJ9GAYEA 786w, https://miro.medium.com/v2/resize:fit:828/0*Y9drAzLenJ9GAYEA 828w, https://miro.medium.com/v2/resize:fit:1100/0*Y9drAzLenJ9GAYEA 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Y9drAzLenJ9GAYEA 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 
67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 1: Click-through rates by listing search rank</figcaption></figure><p id="b486" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">To maximize the connections between guests and Hosts, the ranking algorithm sorts listings by their booking probabilities based on a <a class="af oi" href="https://www.airbnb.com/help/article/39" rel="noopener ugc nofollow" target="_blank">number of factors</a> and sequentially assigns their position in the list-results. This often means that the larger a listing’s booking probability, the more attention it receives from searchers.</p><p id="a432" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">But in map-results, listings are scattered as pins over an area (see Figure 2). There is no ranked list, and there is no decay of user attention by ranking position. 
Therefore, for listings that are shown on the map, the strategy of sorting by booking probabilities is no longer applicable.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div class="nu nv ps"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*6iaMrBpbSQjVnsLF 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*6iaMrBpbSQjVnsLF 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*6iaMrBpbSQjVnsLF 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*6iaMrBpbSQjVnsLF 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*6iaMrBpbSQjVnsLF 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*6iaMrBpbSQjVnsLF 1100w, https://miro.medium.com/v2/resize:fit:1248/format:webp/0*6iaMrBpbSQjVnsLF 1248w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 624px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*6iaMrBpbSQjVnsLF 640w, https://miro.medium.com/v2/resize:fit:720/0*6iaMrBpbSQjVnsLF 720w, https://miro.medium.com/v2/resize:fit:750/0*6iaMrBpbSQjVnsLF 750w, https://miro.medium.com/v2/resize:fit:786/0*6iaMrBpbSQjVnsLF 786w, https://miro.medium.com/v2/resize:fit:828/0*6iaMrBpbSQjVnsLF 828w, https://miro.medium.com/v2/resize:fit:1100/0*6iaMrBpbSQjVnsLF 1100w, https://miro.medium.com/v2/resize:fit:1248/0*6iaMrBpbSQjVnsLF 1248w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 624px" /></picture></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 2: Map results</figcaption></figure><h1 id="6eef" class="ok ol gu bf om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bk">Uniform User Attention</h1><p id="dade" class="pw-post-body-paragraph mw mx gu my b mz pi nb nc nd pj nf ng nh pk nj nk nl pl nn no np pm nr ns nt gn bk">To adapt ranking to the map interface, we look at new ways of modeling user attention flow across a map. We start with the most straightforward assumption that user attention is spread equally across the map pins. User attention is a very precious commodity and most searchers only click through a few map pins (see Figure 3). A large number of pins on the map means those limited clicks may miss discovering the best options available. 
Conversely, limiting the number of pins to the topmost choices increases the probability of the searcher finding something suitable, but runs the risk of removing their preferred choice.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv pt"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Vi5l4XPrl3YdHsP0 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Vi5l4XPrl3YdHsP0 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Vi5l4XPrl3YdHsP0 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Vi5l4XPrl3YdHsP0 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Vi5l4XPrl3YdHsP0 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Vi5l4XPrl3YdHsP0 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Vi5l4XPrl3YdHsP0 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Vi5l4XPrl3YdHsP0 640w, https://miro.medium.com/v2/resize:fit:720/0*Vi5l4XPrl3YdHsP0 720w, https://miro.medium.com/v2/resize:fit:750/0*Vi5l4XPrl3YdHsP0 750w, https://miro.medium.com/v2/resize:fit:786/0*Vi5l4XPrl3YdHsP0 786w, https://miro.medium.com/v2/resize:fit:828/0*Vi5l4XPrl3YdHsP0 828w, https://miro.medium.com/v2/resize:fit:1100/0*Vi5l4XPrl3YdHsP0 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Vi5l4XPrl3YdHsP0 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 3: Number of distinct map pins clicked by percentage of searchers</figcaption></figure><p id="e5b2" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">We test this hypothesis with a pin-selection rule controlled by a single parameter. The parameter serves as an upper bound on the ratio of the highest booking probability to the lowest booking probability among the selected map pins. The bound set by the parameter controls the booking probabilities of the listings behind the map pins: the tighter the bound, the higher the average booking probability of the listings presented as map pins. Figure 4 summarizes the results from A/B testing a range of parameter values.</p><p id="458a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">The reduction in the average impressions to discovery metric in Figure 4 means that a searcher has to process fewer map pins before clicking the listing that they eventually book. 
Similarly, the reduction in average clicks to discovery shows that a searcher has to click through fewer map pins to find the listing they booked.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv nw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*trGxNfKu4rHa4Gpx 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*trGxNfKu4rHa4Gpx 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*trGxNfKu4rHa4Gpx 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*trGxNfKu4rHa4Gpx 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*trGxNfKu4rHa4Gpx 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*trGxNfKu4rHa4Gpx 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*trGxNfKu4rHa4Gpx 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*trGxNfKu4rHa4Gpx 640w, https://miro.medium.com/v2/resize:fit:720/0*trGxNfKu4rHa4Gpx 720w, https://miro.medium.com/v2/resize:fit:750/0*trGxNfKu4rHa4Gpx 750w, https://miro.medium.com/v2/resize:fit:786/0*trGxNfKu4rHa4Gpx 786w, https://miro.medium.com/v2/resize:fit:828/0*trGxNfKu4rHa4Gpx 828w, https://miro.medium.com/v2/resize:fit:1100/0*trGxNfKu4rHa4Gpx 1100w, https://miro.medium.com/v2/resize:fit:1400/0*trGxNfKu4rHa4Gpx 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and 
(max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 4: Exploring through online A/B experiments</figcaption></figure><p id="d078" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Launching the restricted version resulted in one of the largest bookings improvements in Airbnb ranking history. More importantly, the gains were not only in bookings, but in quality bookings. This could be seen in the increase in trips that resulted in a 5-star rating after the stay in the treatment group, compared with trips from the control group.</p><h1 id="ca47" class="ok ol gu bf om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bk">Tiered User Attention</h1><p id="51d7" class="pw-post-body-paragraph mw mx gu my b mz pi nb nc nd pj nf ng nh pk nj nk nl pl nn no np pm nr ns nt gn bk">In our next iteration of modeling user attention, we separate the map pins into two tiers. The listings with the highest booking probabilities are displayed as regular oval pins with price. Listings with comparatively lower booking probabilities are displayed as smaller ovals without price, referred to as mini-pins (Figure 5). 
By design, mini-pins draw less user attention, with click-through rates about 8x less than regular pins.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*pkL4ovuWpR1Rz9z- 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*pkL4ovuWpR1Rz9z- 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*pkL4ovuWpR1Rz9z- 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*pkL4ovuWpR1Rz9z- 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*pkL4ovuWpR1Rz9z- 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*pkL4ovuWpR1Rz9z- 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*pkL4ovuWpR1Rz9z- 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*pkL4ovuWpR1Rz9z- 640w, https://miro.medium.com/v2/resize:fit:720/0*pkL4ovuWpR1Rz9z- 720w, https://miro.medium.com/v2/resize:fit:750/0*pkL4ovuWpR1Rz9z- 750w, https://miro.medium.com/v2/resize:fit:786/0*pkL4ovuWpR1Rz9z- 786w, https://miro.medium.com/v2/resize:fit:828/0*pkL4ovuWpR1Rz9z- 828w, https://miro.medium.com/v2/resize:fit:1100/0*pkL4ovuWpR1Rz9z- 1100w, https://miro.medium.com/v2/resize:fit:1400/0*pkL4ovuWpR1Rz9z- 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 5: Oval pins with price and mini-pins</figcaption></figure><p id="b9d7" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">This comes in handy particularly for searches on desktop where 18 results are shown in a grid on the left, each of them requiring a map pin on the right (Figure 6).</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv nw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*A83SEjyDlyTUCI06 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*A83SEjyDlyTUCI06 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*A83SEjyDlyTUCI06 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*A83SEjyDlyTUCI06 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*A83SEjyDlyTUCI06 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*A83SEjyDlyTUCI06 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*A83SEjyDlyTUCI06 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" 
type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*A83SEjyDlyTUCI06 640w, https://miro.medium.com/v2/resize:fit:720/0*A83SEjyDlyTUCI06 720w, https://miro.medium.com/v2/resize:fit:750/0*A83SEjyDlyTUCI06 750w, https://miro.medium.com/v2/resize:fit:786/0*A83SEjyDlyTUCI06 786w, https://miro.medium.com/v2/resize:fit:828/0*A83SEjyDlyTUCI06 828w, https://miro.medium.com/v2/resize:fit:1100/0*A83SEjyDlyTUCI06 1100w, https://miro.medium.com/v2/resize:fit:1400/0*A83SEjyDlyTUCI06 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 6: Search results on desktop</figcaption></figure><p id="86f1" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">The number of map pins is fixed in this case, and limiting them, as we did in the previous section, is not an option. Creating the two tiers prioritizes user attention towards the map pins with the highest probabilities of getting booked. 
Figure 7 shows the results of testing the idea through an online A/B experiment.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv pv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*1V-XbGegLzPch25O 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*1V-XbGegLzPch25O 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*1V-XbGegLzPch25O 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*1V-XbGegLzPch25O 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*1V-XbGegLzPch25O 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*1V-XbGegLzPch25O 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*1V-XbGegLzPch25O 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*1V-XbGegLzPch25O 640w, https://miro.medium.com/v2/resize:fit:720/0*1V-XbGegLzPch25O 720w, https://miro.medium.com/v2/resize:fit:750/0*1V-XbGegLzPch25O 750w, https://miro.medium.com/v2/resize:fit:786/0*1V-XbGegLzPch25O 786w, https://miro.medium.com/v2/resize:fit:828/0*1V-XbGegLzPch25O 828w, https://miro.medium.com/v2/resize:fit:1100/0*1V-XbGegLzPch25O 1100w, https://miro.medium.com/v2/resize:fit:1400/0*1V-XbGegLzPch25O 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 7: Experiment results for tiered map pins</figcaption></figure><h1 id="ab81" class="ok ol gu bf om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bk">Discounted User Attention</h1><p id="f6ee" class="pw-post-body-paragraph mw mx gu my b mz pi nb nc nd pj nf ng nh pk nj nk nl pl nn no np pm nr ns nt gn bk">In our final iteration, we refine our understanding of how user attention is distributed over the map by plotting the click-through rate of map pins located at different coordinates on the map. Figure 8 shows these plots for the mobile (top) and the desktop apps (bottom).</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div class="nu nv pw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*rDDubemWn97XvCN2 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*rDDubemWn97XvCN2 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*rDDubemWn97XvCN2 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*rDDubemWn97XvCN2 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*rDDubemWn97XvCN2 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*rDDubemWn97XvCN2 1100w, https://miro.medium.com/v2/resize:fit:1196/format:webp/0*rDDubemWn97XvCN2 1196w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and 
(max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 598px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*rDDubemWn97XvCN2 640w, https://miro.medium.com/v2/resize:fit:720/0*rDDubemWn97XvCN2 720w, https://miro.medium.com/v2/resize:fit:750/0*rDDubemWn97XvCN2 750w, https://miro.medium.com/v2/resize:fit:786/0*rDDubemWn97XvCN2 786w, https://miro.medium.com/v2/resize:fit:828/0*rDDubemWn97XvCN2 828w, https://miro.medium.com/v2/resize:fit:1100/0*rDDubemWn97XvCN2 1100w, https://miro.medium.com/v2/resize:fit:1196/0*rDDubemWn97XvCN2 1196w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 598px" /></picture></div></figure><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div class="nu nv px"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*I9GtvJEw5BGfHn96 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*I9GtvJEw5BGfHn96 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*I9GtvJEw5BGfHn96 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*I9GtvJEw5BGfHn96 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*I9GtvJEw5BGfHn96 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*I9GtvJEw5BGfHn96 1100w, https://miro.medium.com/v2/resize:fit:1204/format:webp/0*I9GtvJEw5BGfHn96 1204w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 
3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 602px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*I9GtvJEw5BGfHn96 640w, https://miro.medium.com/v2/resize:fit:720/0*I9GtvJEw5BGfHn96 720w, https://miro.medium.com/v2/resize:fit:750/0*I9GtvJEw5BGfHn96 750w, https://miro.medium.com/v2/resize:fit:786/0*I9GtvJEw5BGfHn96 786w, https://miro.medium.com/v2/resize:fit:828/0*I9GtvJEw5BGfHn96 828w, https://miro.medium.com/v2/resize:fit:1100/0*I9GtvJEw5BGfHn96 1100w, https://miro.medium.com/v2/resize:fit:1204/0*I9GtvJEw5BGfHn96 1204w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 602px" /></picture></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 8: Click-through rates of map pins across map coordinates.</figcaption></figure><p id="66fc" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">To maximize the chances that a searcher will discover the listings with the highest booking probabilities, we design an algorithm that re-centers the map such that the listings with the highest booking probabilities appear closer to the center. 
The steps of this algorithm are illustrated in Figure 9, where a range of potential coordinates is evaluated and the one closest to the listings with the highest booking probabilities is chosen as the new center.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv py"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*IqlsENiSd-9IdQ5v 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*IqlsENiSd-9IdQ5v 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*IqlsENiSd-9IdQ5v 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*IqlsENiSd-9IdQ5v 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*IqlsENiSd-9IdQ5v 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*IqlsENiSd-9IdQ5v 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*IqlsENiSd-9IdQ5v 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*IqlsENiSd-9IdQ5v 640w, https://miro.medium.com/v2/resize:fit:720/0*IqlsENiSd-9IdQ5v 720w, https://miro.medium.com/v2/resize:fit:750/0*IqlsENiSd-9IdQ5v 750w, https://miro.medium.com/v2/resize:fit:786/0*IqlsENiSd-9IdQ5v 786w, https://miro.medium.com/v2/resize:fit:828/0*IqlsENiSd-9IdQ5v 828w, https://miro.medium.com/v2/resize:fit:1100/0*IqlsENiSd-9IdQ5v 1100w, https://miro.medium.com/v2/resize:fit:1400/0*IqlsENiSd-9IdQ5v 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="po ff pp nu nv pq pr bf b bg z du">Figure 9: Algorithm for finding optimal center</figcaption></figure><p id="c53d" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">When tested in an online A/B experiment, the algorithm improved uncancelled bookings by 0.27%. We also observed a reduction of 1.5% in map moves, indicating less effort from the searchers to use the map.</p><h1 id="9615" class="ok ol gu bf om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bk">Conclusion</h1><p id="95a0" class="pw-post-body-paragraph mw mx gu my b mz pi nb nc nd pj nf ng nh pk nj nk nl pl nn no np pm nr ns nt gn bk">Users interact with maps in a way that’s fundamentally different from interacting with items in a list. By modeling the user interaction with maps in a progressively sophisticated manner, we were able to improve the user experience for guests in the real world. However, the current approach has a challenge that remains unsolved: how can we represent the full range of available listings on the map? This is part of our future work. A more in-depth discussion of the topics covered here, along with technical details, is presented in our research paper that was <a class="af oi" href="https://arxiv.org/pdf/2407.00091" rel="noopener ugc nofollow" target="_blank">published at the <strong class="my gv">KDD ’24</strong> conference</a>. 
We welcome all feedback and suggestions.</p><p id="09f1" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">If this type of work interests you, we encourage you to apply for an <a class="af oi" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open position</a> today.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/improving-search-ranking-for-maps-13b03f2c2cca</link>
      <guid>https://medium.com/airbnb-engineering/improving-search-ranking-for-maps-13b03f2c2cca</guid>
      <pubDate>Wed, 18 Dec 2024 19:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb at KDD 2024]]></title>
      <description><![CDATA[<div><div></div><p id="e81e" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb had a large presence at the 2024 KDD conference hosted in Barcelona, Spain. Our data scientists and engineers presented on topics such as Deep Learning &amp; Search Ranking, Online Experimentation &amp; Measurement, Product Quality &amp; Customer Journey, and Two-sided Marketplaces. This blog post summarizes our contributions to KDD for 2024 and provides access to the academic papers presented during the conference.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv nw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*JnSzLDm3Uh2hY6c2 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*JnSzLDm3Uh2hY6c2 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*JnSzLDm3Uh2hY6c2 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*JnSzLDm3Uh2hY6c2 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*JnSzLDm3Uh2hY6c2 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*JnSzLDm3Uh2hY6c2 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*JnSzLDm3Uh2hY6c2 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*JnSzLDm3Uh2hY6c2 640w, https://miro.medium.com/v2/resize:fit:720/0*JnSzLDm3Uh2hY6c2 720w, 
https://miro.medium.com/v2/resize:fit:750/0*JnSzLDm3Uh2hY6c2 750w, https://miro.medium.com/v2/resize:fit:786/0*JnSzLDm3Uh2hY6c2 786w, https://miro.medium.com/v2/resize:fit:828/0*JnSzLDm3Uh2hY6c2 828w, https://miro.medium.com/v2/resize:fit:1100/0*JnSzLDm3Uh2hY6c2 1100w, https://miro.medium.com/v2/resize:fit:1400/0*JnSzLDm3Uh2hY6c2 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="ec1b" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: <a class="af oi" href="mailto:huiji.gao@airbnb.com" rel="noopener ugc nofollow" target="_blank">Huiji Gao</a>, <a class="af oi" href="mailto:peter.coles@airbnb.com" rel="noopener ugc nofollow" target="_blank">Peter Coles</a>, <a class="af oi" href="mailto:carolina.barcenas@airbnb.com" rel="noopener ugc nofollow" target="_blank">Carolina Barcenas</a>, <a class="af oi" href="mailto:sanjeev.katariya@airbnb.com" rel="noopener ugc nofollow" target="_blank">Sanjeev Katariya</a></p><p id="a038" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://kdd.org/" rel="noopener ugc nofollow" target="_blank">KDD</a> (Knowledge Discovery and Data Mining) is one of the most prestigious global conferences in data mining and machine learning. 
Hosted annually by a special interest group of the Association for Computing Machinery (ACM), it’s where attendees learn about some of the most ground-breaking AI developments in data mining, machine learning, knowledge discovery, and large-scale data analytics.</p><p id="6289" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">This year, the 30th KDD conference was held in Barcelona, Spain, attracting thousands of researchers and scientists from academia and industry. Many companies contributed to and attended the conference, including Google, Meta, Apple, Amazon, Airbnb, Pinterest, LinkedIn, Booking, Expedia, and ByteDance. The program included 151 accepted Applied Data Science (ADS) track papers, 411 accepted Research track papers, 34 tutorials, and 30 workshops.</p><p id="20e0" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb had a significant presence at KDD 2024 with three full <a class="af oi" href="https://kdd2024.kdd.org/applied-data-science-track-papers" rel="noopener ugc nofollow" target="_blank">ADS track</a> papers (acceptance rate under 20%), one workshop, and seven workshop papers and invited talks accepted into the main conference proceedings. 
The topics of our work spanned Deep Learning &amp; Search Ranking, Online Experimentation &amp; Measurement, Causal Inference &amp; Machine Learning, and Two-sided Marketplaces.</p><p id="1ce8" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In this blog post, we will summarize our teams’ contributions and share highlights from an exciting week-long conference with research and industry talks, workshops, panel discussions, and more.</p><h1 id="0a84" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk"><strong class="al">Deep Learning and Search Ranking</strong></h1><p id="8bbb" class="pw-post-body-paragraph mw mx gu my b mz ph nb nc nd pi nf ng nh pj nj nk nl pk nn no np pl nr ns nt gn bk">Intelligent search ranking — the process of accurately matching a guest with a listing based on their preferences, the listing’s features, and additional search context — remains a nuanced challenge that researchers are constantly trying to solve.</p><p id="f211" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Making optimal guest-host matches has remained an issue in a two-sided marketplace for a variety of reasons — the timespan of guest searches (ranging between days and weeks), unpredictable host behavior and ratings (the potential for hosts to cancel a booking or receive low ratings), and limited understanding of guest preferences across multiple interfaces. 
We published several papers addressing the issue of search ranking as part of our presence at KDD.</p><p id="25ee" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://arxiv.org/abs/2407.00091" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Learning to Rank for Maps at Airbnb</strong></a></p><p id="dadd" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb brings together hosts who rent listings to prospective guests from around the globe. Results from a guest’s search for listings are displayed primarily through two interfaces: (1) as a list of rectangular cards that contain on them the listing image, price, rating, and other details, referred to as list-results, and (2) as oval pins on a map showing the listing price, called map-results. Both these interfaces, since their inception, have used the same ranking algorithm that orders listings by their booking probabilities and selects the top listings for display.</p><p id="22ea" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">However, some of the basic assumptions underlying ranking are built for a world where search results are presented as lists and simply break down for map-results. In this work, we rebuilt ranking for maps by revising the mathematical foundations of how users interact with map search results. Our iterative and experiment-driven approach led us through a path full of twists and turns, ending in a unified theory for the two interfaces.</p><p id="825a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Our journey shows how assumptions taken for granted when designing machine learning algorithms may not apply equally across all user interfaces, and how they can be adapted. 
The net impact was one of the largest improvements in user experience for Airbnb, which we discuss through a series of experimental validations. The work introduced in this paper is merely the beginning of future exciting research projects, such as making learning to rank unbiased for map-results and demarcating the map pins to direct the user’s attention towards more relevant ones.</p><p id="e93f" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://arxiv.org/abs/2407.07181" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Multi-objective Learning to Rank by Model Distillation</strong></a></p><p id="1903" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In online marketplaces, the objective of search ranking is not only to optimize purchasing or conversion rate (the primary objective), but also to improve purchase outcomes (secondary objectives), e.g., order cancellation, review rating, customer service inquiries, and long-term platform growth. To balance these primary and secondary objectives, several multi-objective learning to rank approaches have been widely studied.</p><p id="4d38" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Traditional approaches in industrial search and recommender systems encounter challenges such as expensive parameter tuning that leads to sub-optimal solutions, imbalanced data sparsity, and a lack of compatibility with ad-hoc objectives. 
In this work, we propose a distillation-based ranking solution for multi-objective ranking, which optimizes the end-to-end ranking system at Airbnb across multiple ranking models on different objectives, along with various considerations to optimize training and serving efficiency to meet industry standards.</p><p id="23dc" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Compared with traditional approaches, the proposed solution not only increases the primary objective of conversion by a large margin, but also addresses the secondary objective constraints while improving model stability. Furthermore, we demonstrated that the proposed system could be further simplified by model self-distillation. We also ran additional simulations to show that this approach could help us efficiently inject ad-hoc non-differentiable business objectives into the ranking system, while enabling us to balance our optimization objectives.</p><h1 id="f7ba" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk"><strong class="al">Online Experimentation and Measurement</strong></h1><p id="b5bc" class="pw-post-body-paragraph mw mx gu my b mz ph nb nc nd pi nf ng nh pj nj nk nl pk nn no np pl nr ns nt gn bk">Online experimentation (e.g., A/B testing) is a common way for organizations like Airbnb to make data-driven decisions. But high variance is frequently a challenge. 
For example, it’s hard to prove that a change in our search UX will drive value because bookings can be infrequent and depend on a large number of interactions over a long period of time.</p><p id="05b9" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://dl.acm.org/doi/pdf/10.1145/3637528.3671556" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Metric Decomposition in A/B Tests</strong></a></p><p id="605f" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">More than a decade ago, CUPED (Controlled Experiments Utilizing Pre-Experiment Data) mainstreamed the idea of variance reduction leveraging pre-experiment covariates. Since its introduction, it has been implemented, extended, and modernized by major online experimentation platforms. Despite the wide adoption, practitioners know that the variance reduction rate from CUPED, utilizing pre-experimental data, varies case by case and has a theoretical limit. In theory, CUPED can be extended to augment a treatment effect estimator utilizing in-experiment data, but practical guidance on how to construct such an augmentation is lacking.</p><p id="0d99" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In this work, we fill this gap by proposing a new direction for sensitivity improvement via treatment effect augmentation, whereby a target metric of interest is decomposed into two or more components in an attempt to isolate those with high signal and low noise from those with low signal and high noise. 
We show through theory, simulation, and empirical examples that if such a decomposition exists (or can be engineered), sensitivity may be increased via approximately null augmentation (in a frequentist setting) and reduced posterior variance (in a Bayesian setting).</p><p id="eb97" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">We provide three real-world applications demonstrating different flavors of metric decomposition. These applications illustrate the gain in agility that metric decomposition yields relative to an un-decomposed analysis, indicating both empirically and theoretically the value of this practice in both frequentist and Bayesian settings. An important extension to this work would be to consider sample size determination in both the frequentist and Bayesian contexts; while a boost in sensitivity typically means less data is required for a given analysis, a methodology that determines the smallest sample size required to control various operating characteristics in this context would be of practical value.</p><h1 id="5546" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Two-sided Marketplace Optimization</h1><p id="7199" class="pw-post-body-paragraph mw mx gu my b mz ph nb nc nd pi nf ng nh pj nj nk nl pk nn no np pl nr ns nt gn bk">Airbnb employees hosted a workshop on <a class="af oi" href="https://sites.google.com/view/tsmo2024/home?authuser=0" rel="noopener ugc nofollow" target="_blank">Two-sided Marketplace Optimization: Search, Pricing, Matching &amp; Growth</a>. 
This workshop brought together practitioners of two-sided marketplaces to discuss the evolution of content ranking, recommendation systems, and data mining when solving for producers and consumers on these platforms.</p><p id="70ec" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Two-sided marketplaces have recently emerged as viable business models for many real-world applications. They model transactions as a network with two distinct types of participants: one type to represent the supply and another the demand of a specific good. Traditionally, research related to online marketplaces focused on how to better satisfy demand. But with two-sided marketplaces, there is more nuance at play. Modern global examples, like Airbnb, operate platforms where users provide services; users may be hosts or guests. Such platforms must develop models that address all their users’ needs and goals at scale. Machine learning-powered methods and algorithms are essential in every aspect of such complex, internet-scale, two-sided marketplaces.</p><p id="c7d4" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb is a community based on connection and belonging, and we strive to connect people and places. 
Our contributions to this workshop showcase the work we’re doing to support this mission by optimizing guest experiences, finding equilibrium spots for listing prices, reducing the incidence of poor interactions (and customer support costs as a side effect), detecting when operational staff should follow up on activity at scale, and more.</p><p id="4ad8" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Understanding-User-Booking-Intent-at-Airbnb.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Guest Intention Modeling for Personalization</strong></a></p><p id="6b36" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb has transformed the way people travel by offering unique and personalized stays in destinations worldwide. To provide a seamless and tailored experience, understanding user intent plays an important role.</p><p id="fd91" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">However, limited user data and unpredictable guest behavior can make it difficult to understand guests’ underlying intent when they browse hosts’ listings. Our work shows how we approach this challenging problem. We describe how we apply a deep learning approach to predict difficult-to-infer details of a user’s travel plan, such as the next destination and travel dates. 
The framework analyzes high-level information from users’ in-app browsing history, booking history, search queries, and other engagement signals, and produces multiple user intent signals.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv pm"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*9QYBiZCmirD4aV1o 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*9QYBiZCmirD4aV1o 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*9QYBiZCmirD4aV1o 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*9QYBiZCmirD4aV1o 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*9QYBiZCmirD4aV1o 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*9QYBiZCmirD4aV1o 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*9QYBiZCmirD4aV1o 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*9QYBiZCmirD4aV1o 640w, https://miro.medium.com/v2/resize:fit:720/0*9QYBiZCmirD4aV1o 720w, https://miro.medium.com/v2/resize:fit:750/0*9QYBiZCmirD4aV1o 750w, https://miro.medium.com/v2/resize:fit:786/0*9QYBiZCmirD4aV1o 786w, https://miro.medium.com/v2/resize:fit:828/0*9QYBiZCmirD4aV1o 828w, https://miro.medium.com/v2/resize:fit:1100/0*9QYBiZCmirD4aV1o 1100w, https://miro.medium.com/v2/resize:fit:1400/0*9QYBiZCmirD4aV1o 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="563f" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Marketing emails, flexible travel search (e.g., for “Europe in the summer”), and recommendations on the app home page are three guest interactions that benefit from correct intention modeling. Hosts also benefit, since a clear understanding of guest demand can help them optimize listings to increase satisfaction and bookings.</p><p id="ac50" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Understanding-Guest-Preferences-and-Optimizing-.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Guest Demand Understanding</strong></a></p><p id="a123" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Hosts can find it difficult to correctly price their listings in two-sided marketplaces serviced by end users. Most hosts are not professional hospitality workers, and would benefit from access to data and advice on how guests see their listings and how they compare to other listings in their neighborhood. We constantly look for ways to give guidance on how hosts can optimally price their listings. 
The same information can then be used to help guests find their ideal stay.</p><p id="3b70" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In our paper, we presented an example of how this problem can be solved in general.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div class="nu nv pn"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*-3Mj0gacSPA56ZQx 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*-3Mj0gacSPA56ZQx 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*-3Mj0gacSPA56ZQx 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*-3Mj0gacSPA56ZQx 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*-3Mj0gacSPA56ZQx 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*-3Mj0gacSPA56ZQx 1100w, https://miro.medium.com/v2/resize:fit:1312/format:webp/0*-3Mj0gacSPA56ZQx 1312w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 656px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*-3Mj0gacSPA56ZQx 640w, https://miro.medium.com/v2/resize:fit:720/0*-3Mj0gacSPA56ZQx 720w, https://miro.medium.com/v2/resize:fit:750/0*-3Mj0gacSPA56ZQx 750w, https://miro.medium.com/v2/resize:fit:786/0*-3Mj0gacSPA56ZQx 786w, https://miro.medium.com/v2/resize:fit:828/0*-3Mj0gacSPA56ZQx 828w, https://miro.medium.com/v2/resize:fit:1100/0*-3Mj0gacSPA56ZQx 1100w, https://miro.medium.com/v2/resize:fit:1312/0*-3Mj0gacSPA56ZQx 1312w" sizes="(min-resolution: 
4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 656px" /></picture></div></figure><p id="8bf7" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">As illustrated above, both demand and supply change over time, influencing the equilibrium price for a property at a specific point. A historical optimum (such as A above) has to be adjusted to find the current optimum (point C). It is difficult to run experiments since any large-scale experiment we might run will cause the environment to change in complex ways. We tackle this problem by combining economic modeling with causal inference techniques. We segment guests and estimate how price-sensitive each guest segment is, then fine-tune these estimates with empirical data from small targeted experiments and larger-scale natural ones. 
Hosts can then use the models’ output to make informed tradeoffs between higher occupancy and higher nightly rates.</p><p id="c53a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Learning-and-Applying-Airbnb-Listing-Embeddings-.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Listing Embedding for Host-side Products</strong></a></p><p id="ffda" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In order to facilitate the matching of listings and guests, Airbnb provides numerous products and services to both hosts and guests. Many of these tools are based on the ability to compare listings, i.e., finding similar listings or listings that may be viewed as equivalent substitutes. Our work presents a study on the application and learning of listing embeddings in Airbnb’s two-sided marketplace. Specifically, we discuss the architecture and training of a neural network embedding model using guest-side engagement data, which is then applied to host-side product surfaces. We address the key technical challenges we encountered, including the formulation of negative training examples, the correction of training data sampling bias, and the scaling and speeding up of training with the help of in-model caching. Additionally, we discuss our comprehensive approach to evaluation, which ranges from in-batch metrics and vocabulary-based evaluation to the properties of similar listings. 
Finally, we share our insights from utilizing listing embeddings in Airbnb products, such as host calendar similar listings.</p><p id="237a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Predicting-Potential-Customer-Support-Needs-and-OptimizingSearch-Ranking.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">Customer Support Optimization in Search Ranking</strong></a></p><p id="107f" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">As of the date of the paper, Airbnb had more than 7.7 million listings from more than 5 million hosts worldwide. Airbnb is investing both in rapid growth and in making sure that the booking experience is pleasant for hosts and guests. It would, however, be ideal to avoid poor experiences in the first place. Our work highlights how we prevent poor experiences without significantly reducing growth.</p><p id="f09a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">We use the mass of accumulated support data at Airbnb to model the probability that, if the current user were to book a listing, they would require customer support (CS). Our model discovered multiple features about the searcher, home, and host that accurately predict CS requirements. For example, same-day bookings tend to require more support, and a responsive host tends to reduce support needs. So, if a guest chooses a same-day booking, matching them with a highly responsive host can lead to a better experience overall. 
We incorporate the output of our CS model into search result rankings; a home will sometimes rank lower if we predict that booking it would lead to a negative experience.</p><p id="2de6" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Can-Language-Models-Accelerate-Prototyping-for-Non-Language-Data-Classification-Summarization-of-Activity-Logs-as-Text.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">LLM Pretraining using Activity Logs</strong></a></p><p id="4c19" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">It’s often important to follow up with users after they’ve had a long series of interactions with a two-sided marketplace to help make sure that their experiences are of high quality. When user interactions meet certain business criteria, operations agents create tickets to follow up with them. For example, user retention and reactivation agents might review user activity logs and decide to follow up with the user, to encourage them to re-engage with the platform.</p><p id="2a4b" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">We propose transforming structured data (activity logs) into a more manageable text format and then leveraging modern language models (i.e., BERT) to pretrain a large language model based on user activities. We then fine-tuned the model using historical data about which users were followed up with and checked its predictions. Our work demonstrates that the large language model trained on pre-processed activity logs can successfully identify when a user should be followed up with, at an experimentally significant rate. 
Our preliminary results suggest that our framework may improve average precision by 80% over a similar model that was designed relying heavily on feature engineering.</p><h1 id="2da2" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Product Quality and Customer Journey Optimization</h1><p id="1c72" class="pw-post-body-paragraph mw mx gu my b mz ph nb nc nd pi nf ng nh pj nj nk nl pk nn no np pl nr ns nt gn bk">Typically, product quality is evaluated based on structured data. Customer ratings, types of support issues, resolution times, and other factors are used as a proxy for how someone booking on Airbnb might value a listing. This kind of data has limitations — more popular listings have more data, users often don’t leave feedback, and feedback is usually biased towards the positive (users with negative experiences tend to churn and not give feedback).</p><p id="0380" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In the Workshop on Causal Inference and Machine Learning in Practice, we highlighted an example of how we push the boundaries of product quality assessment techniques and applications, mixing traditional causal inference with cutting-edge machine learning research. In our work “<a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Understanding-Product-Quality-with-Unstructured-Data.pdf" rel="noopener ugc nofollow" target="_blank">Understanding Product Quality with Unstructured Data: An Application of LLMs and Embeddings at Airbnb</a>”, we presented how an approach based on text embeddings and LLMs can be combined with approaches based on structured data to significantly improve product quality evaluations. We generate text embeddings on a mix of listing and review texts, then cluster the embeddings based on rebooking and churn rates. 
Once we have clear clusters, we extract keywords from the original data, and use these keywords to calculate listing quality scores based on each listing’s similarity to the keyword list.</p><p id="6763" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In addition, we were invited to give a talk on <a class="af oi" href="https://sites.google.com/view/kdd-workshop-2023" rel="noopener ugc nofollow" target="_blank">Quality Foundations at Airbnb</a> at KDD’s <a class="af oi" href="https://sites.google.com/view/kdd-workshop-2023" rel="noopener ugc nofollow" target="_blank">3rd Workshop on End-End Customer Journey Optimization</a>. It’s often hard to differentiate the quality of customer experiences using simple review ratings, in part due to the tightness of their distribution. In this talk, we present an alternative notion of quality based on customer revealed preference: did a customer return to use the platform again after their experience? We describe how a metric — Guest Return Propensity (GRP) — leverages this concept and can differentiate quality, capture platform externalities, and predict future returns.</p><p id="a984" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In practice, this measure may not be suited to many common business use cases due to its lagging nature and an inability to easily explain why it has changed. We describe a quality measurement system that builds on the conceptual foundation of GRP by modeling it as an outcome of upstream realized quality signals. These signals — from sources like reviews and customer support — are weighted by their impact on return propensity and mapped to a quality taxonomy to aid in explainability. 
The resulting score is capable of finely differentiating the quality of customer experiences, aiding tradeoff decisions, and providing timely insights.</p><h1 id="427f" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Conclusion</h1><p id="807d" class="pw-post-body-paragraph mw mx gu my b mz ph nb nc nd pi nf ng nh pj nj nk nl pk nn no np pl nr ns nt gn bk">The 2024 edition of KDD was an amazing opportunity for data scientists and machine learning engineers from around the globe, across industry, government, and academia, to connect and exchange learnings and discoveries. We were honored to have the opportunity to share some of our knowledge and techniques, generalizing what we have learned from applying machine learning to problems we see at Airbnb. We continue to focus on improving our customers’ experience and growing our business, and the information we’ve shared has been crucial to our success. We’re excited to continue learning from peers and to contribute our work back to our community. We eagerly await advancements and improvements that might come about as others build upon the work we’ve shared.</p><p id="47c2" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Below, you’ll find a complete list of the talks and papers shared in this article along with the team members who contributed. 
If this type of work interests you, we encourage you to apply for an <a class="af oi" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open position</a> today.</p><h1 id="2ba1" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">List of papers and talks</h1><p id="a0b3" class="pw-post-body-paragraph mw mx gu my b mz ph nb nc nd pi nf ng nh pj nj nk nl pk nn no np pl nr ns nt gn bk"><strong class="my gv">Learning to Rank for Maps at Airbnb (</strong><a class="af oi" href="https://dl.acm.org/doi/10.1145/3637528.3671648" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="c177" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Malay Haldar, Hongwei Zhang, Kedar Bellare, Sherry Chen, Soumyadip Banerjee, Xiaotang Wang, Mustafa Abdool, Huiji Gao, Pavan Tapadia, Liwei He, Sanjeev Katariya</p><p id="cf4a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Multi-objective Learning to Rank by Model Distillation (</strong><a class="af oi" href="https://dl.acm.org/doi/10.1145/3637528.3671597" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="3764" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Jie Tang, Huiji Gao, Liwei He, Sanjeev Katariya</p><p id="8c47" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Metric Decomposition in A/B Tests (</strong><a class="af oi" href="https://dl.acm.org/doi/10.1145/3637528.3671556" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="f246" 
class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Alex Deng (former employee at Airbnb), Luke Hagar (University of Waterloo), Nathaniel T. Stevens (University of Waterloo), Tatiana Xifara (Airbnb), Amit Gandhi (University of Pennsylvania)</p><p id="eb18" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Understanding Guest Preferences and Optimizing Two-sided Marketplaces: Airbnb as an Example (</strong><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Understanding-Guest-Preferences-and-Optimizing-.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="a0d4" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Yufei Wu, Daniel Shmierer</p><p id="37cf" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Predicting Potential Customer Support Needs and Optimizing Search Ranking in a Two-Sided Marketplace (</strong><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Predicting-Potential-Customer-Support-Needs-and-OptimizingSearch-Ranking.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="c490" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Do-kyum Kim, Han Zhao, Huiji Gao, Liwei He, Malay Haldar, Sanjeev Katariya</p><p id="2fad" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Understanding User Booking Intent at Airbnb (</strong><a class="af oi" 
href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Understanding-User-Booking-Intent-at-Airbnb.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="dcfd" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Xiaowei Liu, Weiwei Guo, Jie Tang, Sherry Chen, Huiji Gao, Liwei He, Pavan Tapadia, Sanjeev Katariya</p><p id="8513" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Can Language Models Accelerate Prototyping for Non-Language Data? Classification &amp; Summarization of Activity Logs as Text (</strong><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Can-Language-Models-Accelerate-Prototyping-for-Non-Language-Data-Classification-Summarization-of-Activity-Logs-as-Text.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="d1e0" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: José González-Brenes</p><p id="ebc4" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Learning and Applying Airbnb Listing Embeddings in Two-Sided Marketplace (</strong><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Learning-and-Applying-Airbnb-Listing-Embeddings-.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="0063" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Siarhei Bykau, Dekun Zou</p><p id="ff3a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr 
ns nt gn bk"><strong class="my gv">Understanding Product Quality with Unstructured Data: An Application of LLMs and Embeddings at Airbnb (</strong><a class="af oi" href="https://airbnb.tech/wp-content/uploads/sites/19/2024/12/Understanding-Product-Quality-with-Unstructured-Data.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="my gv">link</strong></a><strong class="my gv">)</strong></p><p id="eb06" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Authors: Jikun Zhu, Zhiying Gu, Brad Li, Linsha Chen</p><p id="e276" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">Invited Talk: Quality Foundations at Airbnb</strong></p><p id="9aaa" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Speakers: Peter Coles, Mike Egesdal</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnb-at-kdd-2024-d5c2fa81a119</link>
      <guid>https://medium.com/airbnb-engineering/airbnb-at-kdd-2024-d5c2fa81a119</guid>
      <pubDate>Tue, 17 Dec 2024 19:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey To Airbnb | Vijaya Kaza]]></title>
      <description><![CDATA[<div><div></div><figure class="mz na nb nc nd ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*js3cWnfq81siNVRy 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*js3cWnfq81siNVRy 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*js3cWnfq81siNVRy 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*js3cWnfq81siNVRy 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*js3cWnfq81siNVRy 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*js3cWnfq81siNVRy 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*js3cWnfq81siNVRy 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*js3cWnfq81siNVRy 640w, https://miro.medium.com/v2/resize:fit:720/0*js3cWnfq81siNVRy 720w, https://miro.medium.com/v2/resize:fit:750/0*js3cWnfq81siNVRy 750w, https://miro.medium.com/v2/resize:fit:786/0*js3cWnfq81siNVRy 786w, https://miro.medium.com/v2/resize:fit:828/0*js3cWnfq81siNVRy 828w, https://miro.medium.com/v2/resize:fit:1100/0*js3cWnfq81siNVRy 1100w, https://miro.medium.com/v2/resize:fit:1400/0*js3cWnfq81siNVRy 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and 
(max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="61bb" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><a class="af oi" href="https://www.linkedin.com/in/vkaza/" rel="noopener ugc nofollow" target="_blank"><em class="oj">Vijaya Kaza</em></a><em class="oj"> is the Chief Security Officer and Head of Engineering for Trust and Safety at Airbnb. She leads teams responsible for developing the technology (platforms, tools, and AI models) to safeguard the Airbnb community, as well as for securing Airbnb’s infrastructure and information assets. She is also the executive co-sponsor of Airbnb Tech’s Diversity Council.</em></p><p id="e868" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="oj">Here’s Vijaya’s story of how she got to Airbnb, in her own words.</em></p><p id="1e9b" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">Straight shot to science and engineering</strong></p><p id="945d" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">I grew up in a modest, multi-generational family in India with 30 to 40 family members under one roof on any given day. 
As the oldest child in that house, I was expected to excel academically and set an example for the other children to follow.</p><p id="6849" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">In our culture back then, being “good at school” was synonymous with shining in science and math. 
As luck would have it, I had a strong affinity for those subjects and enjoyed studying and diving deep into them. I followed a natural path that combined math and science, studying engineering in college, and got a bachelor’s and two master’s in electrical engineering. This foundation paved the way for my future work in technology.</p><p id="4fed" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">Stumbling into cybersecurity</strong></p><p id="b5d9" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">After college, I landed my first job as a Software Engineer at Cisco. However, my entry into the security field was accidental. I simply followed a manager I liked who was moving into a new security business unit, fell in love with Security and never looked back after that! There was no grand plan or calculated career strategy.</p><p id="e7a0" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">After 17 years at Cisco leading product development for a $1B security product portfolio, I headed to FireEye, another well-known name in the Cybersecurity space. There I had the responsibility for helping the company transition from on-prem to a cloud/SaaS business model, and growing the revenue of their cloud security portfolio. That role gave me the experience of working on different areas of security, as well as leading both Product Management and Engineering in a General Manager capacity.</p><p id="0fa7" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Next, I led Product Development at Lookout, a startup in San Francisco focused on mobile security. At the time, it was a consumer security company pivoting to building for enterprise customers. 
I didn’t know it then, but that glimpse of consumer security was a great primer for my eventual role at Airbnb.</p><p id="865d" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Cybersecurity is an ever-changing domain with constant innovation and I’ve thoroughly enjoyed working and learning in such a dynamic space. Each major technological transformation — from cloud to mobile to AI — brings novel security challenges to solve for businesses and end users.</p><figure class="ol om on oo op ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx ok"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*7dLlEcMROo22Bn4C244qbw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*7dLlEcMROo22Bn4C244qbw.jpeg 640w, 
https://miro.medium.com/v2/resize:fit:720/1*7dLlEcMROo22Bn4C244qbw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*7dLlEcMROo22Bn4C244qbw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*7dLlEcMROo22Bn4C244qbw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*7dLlEcMROo22Bn4C244qbw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*7dLlEcMROo22Bn4C244qbw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*7dLlEcMROo22Bn4C244qbw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="9a8f" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">An unexpected opportunity at Airbnb</strong></p><p id="2c37" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">When I was initially approached for the Chief Security Officer (CSO) role at Airbnb, I was taken aback. I had always worked in the field doing engineering and product development work, so I wasn’t sure about this role. But after meeting Ari Balogh, Airbnb’s Chief Technology Officer (CTO) for an informal coffee chat, we really hit it off and I was thoroughly impressed by his vision to transform the engineering and technology organization within the company. Ari shared his philosophy of focusing on the <a class="af oi" rel="noopener" href="https://medium.com/airbnb-engineering/commitment-to-craft-e36d5a8efe2a">craft of engineering</a> and that really resonated with me. 
The prospect of molding Airbnb’s engineering culture was very enticing.</p><p id="d29f" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Turns out Airbnb was also looking for someone to lead engineering for its Trust and Safety organization. Given my background with a blend of engineering and security domain expertise, I ended up taking on both of these roles. This unique opportunity allowed me to go back to my roots leading engineering teams while taking advantage of my security experience in the capacity of a CSO.</p><p id="c411" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">When I joined the company in 2019, I was struck by Airbnb’s dedication and effort to deliver a positive user experience. The attention and design focus that go into helping guests and hosts have a seamless experience are unlike anything I’ve experienced before.</p><p id="bf46" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">Why Airbnb?</strong></p><p id="92be" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">I joined Airbnb because of the opportunity to have an outsized impact: Airbnb’s almost 6,000 employees serve millions of people worldwide. Our mission-driven approach, powered by our founders’ unmatched passion and commitment to doing good through initiatives like Airbnb.org, really resonated with me. I’ve been continuously impressed by the caliber of talented, caring people here who are united by Airbnb’s vision. 
It’s rare to find a company that so effectively combines technical excellence with social consciousness.</p><p id="96eb" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">Two teams, one mission</strong></p><p id="b333" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">While both the Trust and Safety and Security teams share the common mission of safeguarding users and the platform, the actual techniques, threats, and focus areas are completely different. Trust is at the core of our business and we’re deeply focused on providing our guests and hosts peace of mind as they live, work, travel and host on Airbnb. We build technology to help lower safety and privacy risks for our community.</p><p id="aba7" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">For example, our innovative reservation screening technology aims to help reduce the risk of disruptive parties on Airbnb globally by taking steps to identify higher-risk reservations and potentially prevent these bookings from being made.</p><p id="a04d" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">On the cybersecurity front, we focus on securing Airbnb’s assets, our data, our employees, and our infrastructure. We implement robust security controls and threat detection capabilities to safeguard Airbnb’s internal resources.</p><p id="58e9" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">Embracing the improv mindset</strong></p><p id="d9af" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Outside of work, I’ve pursued some hobbies that have surprisingly imparted invaluable leadership lessons. 
A few years ago, I decided to try improv comedy, something that’s entirely outside my wheelhouse. I had never done anything close to theater or acting in my entire life. What started as a fun experiment quickly became a passion project and I progressed through the levels and eventually performed live in front of an audience with friends and family in attendance.</p><p id="9a4a" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">It may seem unrelated at first, but the very nature of improv is profoundly relevant for leadership. You’re constantly put on the spot and need to think on your feet, responding to new questions and scenarios in the moment. Improv trains this vital skill of processing information in real time and formulating a coherent, compelling reaction.</p><p id="252e" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">Keep a steady head</strong></p><p id="df0b" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Over any career, including mine, there are inevitable professional setbacks, disappointments, and annoyances along the way. The key is to not make much of these short-term hurdles; it’s how you respond that matters most. The less you agonize over the bumps in the road, the better. Maintain your focus, keep one foot in front of the other, and persist forward undeterred. I learned these lessons early, having taken on leadership roles from a young age as the eldest child in a large household. My advice is to keep a steady head, maintain perspective, and plow forward with conviction.</p><p id="9785" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">We’re currently expanding the Airbnb team and hiring for several roles. 
Check out our open positions <a class="af oi" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-vijaya-kaza-8f06543b38d5</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-vijaya-kaza-8f06543b38d5</guid>
      <pubDate>Thu, 05 Dec 2024 19:36:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[From Data to Insights: Segmenting Airbnb’s Supply]]></title>
      <description><![CDATA[<div class="gn go gp gq gr"><div class="ab cb"><div class="ci bh fz ga gb gc"><div><div><h2 id="dc21" class="pw-subtitle-paragraph hr gt gu bf b hs ht hu hv hw hx hy hz ia ib ic id ie if ig cq du">How Airbnb uses data-driven segmentation to understand supply availability patterns.</h2><div></div><p id="628a" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk"><strong class="nk gv">By:</strong> <a class="af oe" href="https://www.linkedin.com/in/alexandre-salama" rel="noopener ugc nofollow" target="_blank">Alexandre Salama</a>, <a class="af oe" href="https://www.linkedin.com/in/timabraham" rel="noopener ugc nofollow" target="_blank">Tim Abraham</a></p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og oh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) 
and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*Lqx5Zg187zZY1UewaG7Q7Q.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="fb4a" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Introduction</h1><p id="ad10" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">At Airbnb, our supply comes from hosts who decide to list their spaces on our platform. Unlike traditional hotels, these spaces are not all interchangeable units in a building that are available to book year-round. Our hosts are people, with different earnings objectives and schedule constraints — leading to different levels of availability to host. 
Understanding these differences is a key input into how we develop our products, campaigns, and operations.</p><p id="aea4" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Over the years, we’ve created various ways to measure host availability, developing “features” that capture different aspects of how and when listings are available. However, these features provide an incomplete picture when viewed in isolation. For example, a ~30% availability rate could indicate two very different scenarios: a host who only accepts bookings on weekends, or a host whose listing is only available during a specific season, such as summer.</p><p id="333a" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">This is where segmentation comes in.</p><blockquote class="pu"><p id="9855" class="pv pw gu bf px py pz qa qb qc qd od du">By combining multiple features, segmentation allows us to create discrete categories that represent the different availability patterns of hosts.</p></blockquote><p id="b08d" class="pw-post-body-paragraph ni nj gu nk b hs qe nm nn hv qf np nq nr qg nt nu nv qh nx ny nz qi ob oc od gn bk">But traditional segmentation methodologies, such as “<a class="af oe" href="https://en.wikipedia.org/wiki/RFM_(market_research)" rel="noopener ugc nofollow" target="_blank">RFM</a>” (Recency, Frequency, Monetary), are focused on customer value rather than calendar dynamics, and are often limited to one-off analyses on small datasets. In contrast, we need an approach that can handle calendar data and daily inference for millions of listings.</p><p id="9b17" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">To address the above challenges, this blog post explores how Airbnb used segmentation to better understand host behavior at scale. 
<strong class="nk gv">By enriching availability data with novel features and applying machine learning techniques, we developed a practical and scalable approach to segment availability for millions of listings daily.</strong></p><h1 id="5149" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Example: Distinguishing Hosts with Similar Profiles</h1><p id="5a65" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">Consider Alice and Max, two hosts with identical 2-bedroom apartments on Airbnb. However, Alice only lists her property in the summer, while Max has it available year-round — reflecting two distinct hosting styles.</p><p id="b360" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Alice’s seasonal availability suggests that she might live in the property most of the time, only renting it out during the summer months. Airbnb can support her with seasonal pricing tips, onboarding guides for occasional hosts, and settings suggestions.</p><p id="a86d" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Conversely, Max’s full-time availability indicates a more professional hosting style, possibly his primary income source. 
Airbnb can provide him with advanced booking analytics, tools for managing multiple reservations, and guidance on earnings and tax implications.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og qj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*3OwRiPtxF8IMHd0ESTAemg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*3OwRiPtxF8IMHd0ESTAemg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*3OwRiPtxF8IMHd0ESTAemg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*3OwRiPtxF8IMHd0ESTAemg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*3OwRiPtxF8IMHd0ESTAemg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*3OwRiPtxF8IMHd0ESTAemg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*3OwRiPtxF8IMHd0ESTAemg.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*3OwRiPtxF8IMHd0ESTAemg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Two Hosts with Similar Profiles (Illustrative)</figcaption></figure><p id="c25b" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk"><strong class="nk gv">How can we create a dataset that captures these crucial differences in hosting behavior?</strong></p><h1 id="3456" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Dataset</h1><h2 id="d709" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">Availability Rate</h2><p id="93d4" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">A first step is to capture the host’s “intention to be available” on a specific night. Availability can be analyzed from either a backward-looking (in the past) or a forward-looking (in the future) perspective. For simplicity, this post focuses on backward-looking availability, as it reflects the final state of a calendar after all changes in inventory, bookings, and cancellations have occurred. 
Forward-looking availability is not as straightforward because changes can still happen between the analysis date and the future dates being analyzed.</p><p id="e14f" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">We consider both:</p><ul class=""><li id="828d" class="ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od rd re rf bk"><strong class="nk gv">Nights Vacant:</strong> nights when the listing was listed as available for booking on Airbnb, and remained vacant.</li><li id="4f59" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk"><strong class="nk gv">Nights Booked: </strong>nights when the listing was listed as available for booking on Airbnb, and was later booked on Airbnb.</li></ul><p id="973d" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Consequently, we can calculate the corresponding Nights Intended to be Available, or Nights Available, for the 365-day look-back period as the sum of Nights Vacant and Nights Booked. 
We then divide it by 365, to obtain the corresponding <strong class="nk gv">Availability Rate</strong>.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rl"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*tC4E7RzzTyX50VPOjfMoIg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*tC4E7RzzTyX50VPOjfMoIg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*tC4E7RzzTyX50VPOjfMoIg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*tC4E7RzzTyX50VPOjfMoIg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*tC4E7RzzTyX50VPOjfMoIg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*tC4E7RzzTyX50VPOjfMoIg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*tC4E7RzzTyX50VPOjfMoIg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*tC4E7RzzTyX50VPOjfMoIg.png 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Distribution of Listings by Availability Rate in the Previous Year (Illustrative)</figcaption></figure><p id="2162" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">From this distribution we observe:</p><ul class=""><li id="dc2b" class="ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od rd re rf bk">A considerable proportion of listings has little-to-no availability (~0% availability rate).</li><li id="3c88" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Conversely, a significant proportion of listings has near full availability (~100% availability rate).</li><li id="f43e" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Between these extremes, a significant set of listings emerges without strong breakpoints.</li></ul><p id="5dc8" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">How can we further differentiate these listings that fall in the middle range?</p><h2 id="b267" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">Streakiness</h2><p id="0a5d" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">For listings that are not at either end of the spectrum, 
availability rate on its own is insufficient for capturing the nuances of how a listing is made available throughout the month. Consider listings A and B, which both have a 50% availability rate in a given month.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rm"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*y1TNNHhfZPlc_R9y8zdBhg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*y1TNNHhfZPlc_R9y8zdBhg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*y1TNNHhfZPlc_R9y8zdBhg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*y1TNNHhfZPlc_R9y8zdBhg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*y1TNNHhfZPlc_R9y8zdBhg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*y1TNNHhfZPlc_R9y8zdBhg.png 828w, 
https://miro.medium.com/v2/resize:fit:1100/1*y1TNNHhfZPlc_R9y8zdBhg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*y1TNNHhfZPlc_R9y8zdBhg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Two Listings with Similar Availability Rates but Distinct Calendar Patterns</figcaption></figure><p id="c304" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Although these listings have distinct availability patterns, they both have the same availability rate (50%)!</p><p id="c5d4" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Listing A’s concentrated, block-like availability could lend itself to recommendations for weekly stay discounts, or advice for hosts who are away for a longer stretch — guidance which may not be suitable for Listing B.</p><p id="d707" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">To capture this distinction, we introduce “Streakiness”. 
In the example above, Listing A had 1 long streak of availability that was interrupted on night 16, while Listing B had 8 short streaks of availability, each lasting 2 nights before a 2-night break.</p><p id="526f" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">We define a streak as a run of at least 2 consecutive nights of availability, followed by at least 2 consecutive nights of unavailability, as described in the diagram below. Note that we initially considered using a single night of availability/unavailability as a threshold but found it to be a less reliable signal of the consistency that streakiness aims to measure.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og qj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*lOXpLC6TyG_HEhwjg4hEfA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and 
(max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*lOXpLC6TyG_HEhwjg4hEfA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*lOXpLC6TyG_HEhwjg4hEfA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*lOXpLC6TyG_HEhwjg4hEfA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*lOXpLC6TyG_HEhwjg4hEfA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*lOXpLC6TyG_HEhwjg4hEfA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*lOXpLC6TyG_HEhwjg4hEfA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*lOXpLC6TyG_HEhwjg4hEfA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Streak Definition</figcaption></figure><p id="67b8" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">This leads us to the corresponding <strong class="nk gv">Streakiness</strong> feature, computed as the number of Streaks divided by the number of Nights Available (computed in the previous section). 
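</p><p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">The two features defined so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the production implementation: the calendar is assumed to be a list of booleans (True for a night that was vacant or booked), the function names are hypothetical, and a run that reaches the end of the window is assumed to count as a streak, which the post does not specify. The two calendars only loosely mirror Listings A and B.</p>

```python
from itertools import groupby


def availability_rate(calendar):
    """Share of nights in the window that were available (vacant or booked)."""
    return sum(calendar) / len(calendar)


def count_streaks(calendar, min_run=2, min_gap=2):
    """Count runs of at least min_run available nights that are followed by
    at least min_gap unavailable nights.

    Assumption: a qualifying run that reaches the end of the window
    is also counted as a streak.
    """
    # Run-length encode the calendar: [(is_available, run_length), ...]
    runs = [(value, len(list(group))) for value, group in groupby(calendar)]
    streaks = 0
    for i, (is_available, length) in enumerate(runs):
        if is_available and length >= min_run:
            is_last_run = i == len(runs) - 1
            if is_last_run or runs[i + 1][1] >= min_gap:
                streaks += 1
    return streaks


def streakiness(calendar):
    """Number of streaks divided by the number of nights available."""
    nights_available = sum(calendar)
    if nights_available == 0:
        return 0.0
    return count_streaks(calendar) / nights_available


# Hypothetical calendars: one long block vs. a repeating 2-on / 2-off pattern.
listing_a = [True] * 15 + [False] * 15
listing_b = ([True] * 2 + [False] * 2) * 8

for name, calendar in (("A", listing_a), ("B", listing_b)):
    print(name, availability_rate(calendar), count_streaks(calendar),
          round(streakiness(calendar), 3))
```

<p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Under these assumptions, both calendars have a 50% availability rate, while the second calendar's streakiness (8 streaks over 16 nights) is far higher than the first's (1 streak over 15 nights).</p><p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">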
At this point, we now have two relatively orthogonal features for our analysis: availability rate and streakiness.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rn"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 640w, https://miro.medium.com/v2/resize:fit:720/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 720w, https://miro.medium.com/v2/resize:fit:750/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 750w, https://miro.medium.com/v2/resize:fit:786/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 786w, https://miro.medium.com/v2/resize:fit:828/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*Qr4Z7Kl8FLuw4YMzCqYf3A.png 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Combining Availability + Streakiness</figcaption></figure><h2 id="7d9b" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">Seasonality</h2><p id="60f4" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">We found that while availability and streakiness provide a solid basis for measuring volume and consistency, they don’t capture a calendar’s “compactness” — in other words, its <strong class="nk gv">seasonality</strong>. 
As an example, consider Listings C and D, which both have around 15% availability and 14 streaks:</p><ul class=""><li id="af86" class="ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od rd re rf bk">Listing C concentrates its availability within a narrower block of time (summer season) — see first calendar below.</li><li id="e4d2" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Listing D distributes its availability more evenly across multiple quarters — see second calendar below.</li></ul><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og ro"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*s0eOYP6Nzlmb0fNJLwoNSA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*s0eOYP6Nzlmb0fNJLwoNSA.png 640w, 
https://miro.medium.com/v2/resize:fit:720/1*s0eOYP6Nzlmb0fNJLwoNSA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*s0eOYP6Nzlmb0fNJLwoNSA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*s0eOYP6Nzlmb0fNJLwoNSA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*s0eOYP6Nzlmb0fNJLwoNSA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*s0eOYP6Nzlmb0fNJLwoNSA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*s0eOYP6Nzlmb0fNJLwoNSA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Two Listings with Similar Availability Rates / Streakiness but Distinct Calendar Patterns</figcaption></figure><p id="d7bd" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Seasonality plays a crucial role in Airbnb’s business, as guest demand and host availability fluctuate with changes in seasonal appeal, holidays, and local events. Given this, we propose to create a <strong class="nk gv">Quarters with at Least One Night of Availability</strong> feature.</p><p id="4c10" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Additionally, we create a <strong class="nk gv">Maximum Consecutive Months</strong> feature which captures streakiness at a yearly scale, highlighting the longest continuous period a listing is available. 
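</p><p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">As a rough sketch, the two seasonality features could be derived from a dated calendar as follows. Several details here are assumptions rather than the post's definitions: the calendar is a dict from date to a boolean, quarters are calendar quarters, a month counts as available if it has at least one available night, and all names are hypothetical. The example calendars only loosely mirror Listings C and D.</p>

```python
from datetime import date, timedelta


def seasonality_features(calendar):
    """calendar: dict mapping datetime.date to bool (True = available night)."""
    available_days = sorted(day for day, is_available in calendar.items()
                            if is_available)
    # Calendar quarters, as (year, quarter), with at least one available night.
    quarters = {(d.year, (d.month - 1) // 3 + 1) for d in available_days}
    # Running month index so that December followed by January is consecutive.
    months = sorted({d.year * 12 + d.month for d in available_days})
    longest = current = 0
    previous = None
    for month in months:
        if previous is not None and month == previous + 1:
            current += 1
        else:
            current = 1
        longest = max(longest, current)
        previous = month
    return {"quarters_with_availability": len(quarters),
            "max_consecutive_months": longest}


def build_calendar(year, available_ranges):
    """Hypothetical helper: 365 nights from January 1, with the given
    inclusive (first, last) date ranges marked as available."""
    start = date(year, 1, 1)
    calendar = {start + timedelta(days=i): False for i in range(365)}
    for first, last in available_ranges:
        for offset in range((last - first).days + 1):
            calendar[first + timedelta(days=offset)] = True
    return calendar


# Concentrated summer availability vs. availability spread across the year.
listing_c = build_calendar(2023, [(date(2023, 6, 15), date(2023, 8, 8))])
listing_d = build_calendar(2023, [
    (date(2023, 1, 10), date(2023, 1, 23)),
    (date(2023, 4, 10), date(2023, 4, 23)),
    (date(2023, 7, 10), date(2023, 7, 23)),
    (date(2023, 10, 10), date(2023, 10, 23)),
])

print(seasonality_features(listing_c))
print(seasonality_features(listing_d))
```

<p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Under these assumptions, the concentrated calendar touches 2 quarters with 3 consecutive months of availability, while the spread-out calendar touches 4 quarters with no two consecutive months.</p><p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">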
Together, these features give clearer insight into seasonal patterns.</p><h2 id="f8e0" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">Final dataset</h2><p id="eb31" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">The final feature set includes all listings that were listed on the platform as of a broad set of dates. For each listing, we calculate the features we’ve designed in the previous sections. Then, we take a large, random sample across these dates. Finally, we scale the numerical features to ensure they are on a comparable scale.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rp"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*3s1H1osFkh-GNQpHgBB5wA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*3s1H1osFkh-GNQpHgBB5wA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*3s1H1osFkh-GNQpHgBB5wA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*3s1H1osFkh-GNQpHgBB5wA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*3s1H1osFkh-GNQpHgBB5wA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*3s1H1osFkh-GNQpHgBB5wA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*3s1H1osFkh-GNQpHgBB5wA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*3s1H1osFkh-GNQpHgBB5wA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Sample Listings Depicting our Feature Set</figcaption></figure><h1 id="9821" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Segmentation Model</h1><p id="8fb9" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">We can now apply a <a class="af oe" href="https://en.wikipedia.org/wiki/K-means_clustering" rel="noopener ugc nofollow" target="_blank"><strong class="nk gv">K-means clustering algorithm</strong></a> to identify segments, testing models with K values from 2 to 10. 
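</p><p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">To make the K sweep concrete, here is a small, dependency-free sketch that runs K-means for K from 2 to 10 and records the within-cluster sum of squares (inertia) for each K, which is the quantity an elbow plot displays. It is an illustration only: the data is synthetic (three artificial groups of already-scaled two-feature points), the function names are hypothetical, and a deterministic farthest-point initialization is swapped in for the random initialization usually used, purely to keep the example reproducible. The shape of this toy curve says nothing about the real data.</p>

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic init (an assumption of this sketch): start from the first
    point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) != k:
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns (centroids, inertia)."""
    centroids = farthest_point_init(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(squared_distance(p, c) for c in centroids)
                  for p in points)
    return centroids, inertia


# Synthetic, already-scaled (availability rate, streakiness) pairs.
rng = random.Random(42)
centers = [(0.05, 0.10), (0.50, 0.50), (0.95, 0.05)]
points = [(cx + rng.gauss(0, 0.03), cy + rng.gauss(0, 0.03))
          for cx, cy in centers for _ in range(50)]

inertia_by_k = {k: kmeans(points, k)[1] for k in range(2, 11)}
for k, inertia in inertia_by_k.items():
    print(k, round(inertia, 4))
```

<p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">On this toy data the inertia drops sharply up to K = 3 and flattens afterwards; the elbow is the K at which that flattening begins.</p><p class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">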
Using the <a class="af oe" href="https://en.wikipedia.org/wiki/Elbow_method_(clustering)" rel="noopener ugc nofollow" target="_blank">elbow plot</a> to find the optimal number of clusters, we select 8 clusters as the best representation of our data.</p><p id="343c" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">We now have our clusters, but they don’t have names yet. Our cluster naming process involves several steps:</p><ul class=""><li id="b445" class="ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od rd re rf bk">Checking the distribution of each feature by cluster to identify strong differences (e.g., “cluster 1 has the highest availability rate”)</li><li id="9d2d" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Randomly sampling listings from each cluster and visualizing their calendars</li><li id="8580" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Iterating on naming with a cross-functional internal working group</li></ul><p id="9868" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">The output of this process is summarized in the table below, while the following diagram displays a “typical” calendar for each cluster.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 786w, 
https://miro.medium.com/v2/resize:fit:828/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*yayTaaAXJaANqK3ORFvl9A.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*yayTaaAXJaANqK3ORFvl9A.png 640w, https://miro.medium.com/v2/resize:fit:720/1*yayTaaAXJaANqK3ORFvl9A.png 720w, https://miro.medium.com/v2/resize:fit:750/1*yayTaaAXJaANqK3ORFvl9A.png 750w, https://miro.medium.com/v2/resize:fit:786/1*yayTaaAXJaANqK3ORFvl9A.png 786w, https://miro.medium.com/v2/resize:fit:828/1*yayTaaAXJaANqK3ORFvl9A.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*yayTaaAXJaANqK3ORFvl9A.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*yayTaaAXJaANqK3ORFvl9A.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">From Cluster Intuition to Cluster Name</figcaption></figure><figure 
class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*gudh_qoE2suaIibj7O2K9Q.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*gudh_qoE2suaIibj7O2K9Q.png 640w, https://miro.medium.com/v2/resize:fit:720/1*gudh_qoE2suaIibj7O2K9Q.png 720w, https://miro.medium.com/v2/resize:fit:750/1*gudh_qoE2suaIibj7O2K9Q.png 750w, https://miro.medium.com/v2/resize:fit:786/1*gudh_qoE2suaIibj7O2K9Q.png 786w, https://miro.medium.com/v2/resize:fit:828/1*gudh_qoE2suaIibj7O2K9Q.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*gudh_qoE2suaIibj7O2K9Q.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*gudh_qoE2suaIibj7O2K9Q.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Examples of Calendars by Cluster</figcaption></figure><h1 id="006e" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Segment Validation</h1><p id="eb7b" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk"><strong class="nk gv">Since we are measuring a latent attribute </strong>— underlying host behavior patterns that don’t have “ground truth” labels — <strong class="nk gv">there is no perfectly accurate way to validate our segmentation</strong>. However, we can use various methodologies to ensure that it “makes sense” from a business perspective, and reliably reflects real-life host behaviors.</p><p id="9d02" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">We do so in three steps:</p><ul class=""><li id="f510" class="ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od rd re rf bk">A/B Testing</li><li id="254a" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Correlates of Availability Segments</li><li id="c380" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">User Experience (UX) Research</li></ul><h2 id="48d6" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">A/B Testing</h2><p id="11d4" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">In an 
A/B test, we assessed how the different segments previously used a feature that encouraged hosts to complete “recommended actions” (e.g., letting guests book their home last-minute) so they may earn a monetary incentive.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rs"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*hwUsT0koNgogJ_gEsAIppw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*hwUsT0koNgogJ_gEsAIppw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*hwUsT0koNgogJ_gEsAIppw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*hwUsT0koNgogJ_gEsAIppw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*hwUsT0koNgogJ_gEsAIppw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*hwUsT0koNgogJ_gEsAIppw.png 828w, 
https://miro.medium.com/v2/resize:fit:1100/1*hwUsT0koNgogJ_gEsAIppw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*hwUsT0koNgogJ_gEsAIppw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Example of Host-Facing Recommended Actions</figcaption></figure><p id="496c" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">We show the use of the feature by each segment below. These results align with our intuition: hosts who use Airbnb for specific occasions or rarely may not be interested in following recommendations, even when incentivized. Similarly, “Always On” hosts, who are already highly engaged and proactive in managing their listings, might prefer to rely on their own strategies rather than follow Airbnb’s suggestions. 
Hosts who fall somewhere in between, with moderate levels of engagement, may be the ideal target for incentives, as they are likely open to adjustments that could boost their performance.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rt"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*NuJpZ4btpXn23btHI8M2RQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*NuJpZ4btpXn23btHI8M2RQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*NuJpZ4btpXn23btHI8M2RQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*NuJpZ4btpXn23btHI8M2RQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*NuJpZ4btpXn23btHI8M2RQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*NuJpZ4btpXn23btHI8M2RQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*NuJpZ4btpXn23btHI8M2RQ.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*NuJpZ4btpXn23btHI8M2RQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Example of Heterogeneous Treatment Effects by Availability Segment<br />(“CI” = Confidence Interval)</figcaption></figure><h2 id="6f5e" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">Correlates of Availability Segments</h2><p id="0c9f" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">We also validate our clusters by checking correlations with known attributes. For instance, we confirm that “Always On” listings are more likely to be managed by professionals, and that “Short Seasonal” listings tend to be more common in ski or beach destinations.</p><p id="c992" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Furthermore, we know it is common to observe an increase in the number of listings around big events. 
As expected, we observe a rise in “Event Motivated” listings leading up to and during major event periods, reflecting hosts’ responsiveness to increased demand.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og ru"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*QHD7knLqv1-BgqwEkp_6cQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*QHD7knLqv1-BgqwEkp_6cQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*QHD7knLqv1-BgqwEkp_6cQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*QHD7knLqv1-BgqwEkp_6cQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*QHD7knLqv1-BgqwEkp_6cQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*QHD7knLqv1-BgqwEkp_6cQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*QHD7knLqv1-BgqwEkp_6cQ.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*QHD7knLqv1-BgqwEkp_6cQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Impact of an Event on the % of Event-Motivated Listings (Illustrative)</figcaption></figure><h2 id="c4bb" class="qo ou gu bf ov qp qq dy oy qr qs ea pb nr qt qu qv nv qw qx qy nz qz ra rb rc bk">UX Research</h2><p id="5edd" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">Finally, the UX Research team conducts host surveys to create qualitative personas, which we compare against our clusters to ensure they align with real-world behavior. For instance, we verify whether segments with high weekend availability match hosts who self-report preferring weekend rentals.</p><h1 id="ef1f" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Scaling and Productionization</h1><p id="e2c2" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">Now, we need to scale this segmentation to all our listings.</p><p id="1974" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">To achieve this, we use a <a class="af oe" href="https://en.wikipedia.org/wiki/Decision_tree_learning" rel="noopener ugc nofollow" target="_blank"><strong class="nk gv">decision tree algorithm</strong></a>. 
We train a model using our 4 features, with cluster labels from our K-means model as outputs. We also perform a train-test split to make sure the model accurately predicts each cluster.</p><p id="bcf8" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">This new model provides a simple, interpretable set of if-else rules to classify listings into clusters. <strong class="nk gv">Using the decision tree structure, we translate the model’s logic into a SQL query by converting the decision tree’s “IF” conditions into “CASE WHEN” statements</strong>. This integration enables the model to be propagated in our data warehouse.</p><figure class="oi oj ok ol om on of og paragraph-image"><div role="button" tabindex="0" class="oo op fj oq bh or"><div class="of og rv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*vl5fzKGg253W85-P5pqTfA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source 
data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*vl5fzKGg253W85-P5pqTfA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*vl5fzKGg253W85-P5pqTfA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*vl5fzKGg253W85-P5pqTfA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*vl5fzKGg253W85-P5pqTfA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*vl5fzKGg253W85-P5pqTfA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*vl5fzKGg253W85-P5pqTfA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*vl5fzKGg253W85-P5pqTfA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qk ff ql of og qm qn bf b bg z du">Decision Tree Structure</figcaption></figure><h1 id="a13f" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Applications at Airbnb and Beyond</h1><p id="3a75" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">At Airbnb, various teams leverage these segments: product teams to inform strategy and analyze heterogeneous treatment effects in A/B tests, marketing teams for targeted messaging, and UX research teams for insights into hosts’ motivations.</p><p id="5c5a" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">For instance, we revealed an opportunity to boost <a class="af oe" href="https://www.airbnb.com/help/article/523" rel="noopener ugc nofollow" target="_blank">Instant Book</a> 
adoption among “Event Motivated” hosts, who may occasionally list their primary residence and prefer manual guest screening. Adding an option for hosts to only accept guests with a certain rating may make Instant Book more appealing to them, offering a balance between host control and booking efficiency.</p><p id="ffce" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk">Initially designed for listing availability data, this segmentation methodology has also been adapted to host activity data. We developed a second segmentation focused on days with “host engagement” (e.g., adjusting prices, updating policies, revising listing descriptions) to differentiate occasional “Settings Tinkerers” from frequent “Settings Optimizers.”</p><p id="39b0" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk"><strong class="nk gv">This approach can also be adapted to other industries where understanding temporal engagement is essential</strong>, for instance, to distinguish:</p><ul class=""><li id="3f7a" class="ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od rd re rf bk">Social Media: casual lurkers vs. active content creators</li><li id="8d1d" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Ridesharing: occasional drivers during peak demand vs. full-time drivers</li><li id="b8fa" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">Streaming Services: nighttime streamers vs. continuous streamers</li><li id="0c7a" class="ni nj gu nk b hs rg nm nn hv rh np nq nr ri nt nu nv rj nx ny nz rk ob oc od rd re rf bk">E-commerce: sales/holidays enthusiasts vs. 
year-round shoppers</li></ul><h1 id="06a6" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Acknowledgments</h1><p id="1d69" class="pw-post-body-paragraph ni nj gu nk b hs pp nm nn hv pq np nq nr pr nt nu nv ps nx ny nz pt ob oc od gn bk">This blog post was a collaborative effort, with significant contributions from Tim Abraham, the main co-author. We’d also like to acknowledge the invaluable support of team members from multiple organizations, including (but not limited to) Regina Wu, Maggie Jarley, and Peter Coles.</p></div></div></div><div class="ab cb rw rx ry rz" role="separator"><div class="gn go gp gq gr"><div class="ab cb"><div class="ci bh fz ga gb gc"><p id="b81c" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk"><em class="se">Does this type of work interest you? We’re </em><a class="af oe" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="se">hiring</em></a><em class="se">!</em></p><p id="1410" class="pw-post-body-paragraph ni nj gu nk b hs nl nm nn hv no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gn bk"><em class="se">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/from-data-to-insights-segmenting-airbnbs-supply-c88aa2bb9399</link>
      <guid>https://medium.com/airbnb-engineering/from-data-to-insights-segmenting-airbnbs-supply-c88aa2bb9399</guid>
      <pubDate>Mon, 25 Nov 2024 19:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Building a User Signals Platform at Airbnb]]></title>
      <description><![CDATA[<div class="gn go gp gq gr"><div class="ab cb"><div class="ci bh fz ga gb gc"><div><div></div></div></div></div><div class="ab cb mw mx my mz" role="separator"><div class="gn go gp gq gr"><div class="ab cb"><div class="ci bh fz ga gb gc"><p id="4ec5" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">How Airbnb built a stream processing platform to power user personalization.</p><p id="bd3d" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk"><strong class="ng gv">By:</strong> Kidai Kwon, Pavan Tambay, Xinrui Hua, Soumyadip (Soumo) Banerjee, Phanindra (Phani) Ganti</p><figure class="of og oh oi oj ok oc od paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="oc od oe"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ZDusO7LglpaC7sF7 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ZDusO7LglpaC7sF7 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ZDusO7LglpaC7sF7 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*ZDusO7LglpaC7sF7 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ZDusO7LglpaC7sF7 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ZDusO7LglpaC7sF7 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ZDusO7LglpaC7sF7 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/0*ZDusO7LglpaC7sF7 640w, https://miro.medium.com/v2/resize:fit:720/0*ZDusO7LglpaC7sF7 720w, https://miro.medium.com/v2/resize:fit:750/0*ZDusO7LglpaC7sF7 750w, https://miro.medium.com/v2/resize:fit:786/0*ZDusO7LglpaC7sF7 786w, https://miro.medium.com/v2/resize:fit:828/0*ZDusO7LglpaC7sF7 828w, https://miro.medium.com/v2/resize:fit:1100/0*ZDusO7LglpaC7sF7 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ZDusO7LglpaC7sF7 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="7135" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">Overview</h1><p id="700b" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">Understanding user actions is critical for delivering a more personalized product experience. In this blog, we will explore how Airbnb developed a large-scale, near real-time stream processing platform for capturing and understanding user actions, which enables multiple teams to easily leverage real-time user activities. 
Additionally, we will discuss the challenges encountered and valuable insights gained from operating a large-scale stream processing platform.</p><h1 id="19d0" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">Background</h1><p id="add7" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">Airbnb connects millions of guests with unique homes and experiences worldwide. To help guests make the best travel decisions, providing personalized experiences throughout the booking process is essential. Guests may move through various stages — browsing destinations, planning trips, wishlisting, comparing listings, and finally booking. At each stage, Airbnb can enhance the guest experience through tailored interactions, both within the app and through notifications.</p><p id="164f" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">This personalization can range from understanding recent user activities, like searches and viewed homes, to segmenting users based on their trip intent and stage. A robust infrastructure is essential for processing extensive user engagement data and delivering insights in near real-time. 
Additionally, it’s important to platformize the infrastructure so that other teams can contribute to deriving user insights, especially since many engineering teams are not familiar with stream processing.</p><p id="ca4a" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">Airbnb’s User Signals Platform (USP) is designed to leverage user engagement data to provide personalized product experiences, with the following goals:</p><ul class=""><li id="70f3" class="ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob pt pu pv bk">Ability to store both real-time and historical data about users’ engagement across the site.</li><li id="190f" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk">Ability to query data for both online use cases and offline data analyses.</li><li id="f83a" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk">Ability to support online serving use cases with real-time data, with an end-to-end streaming latency of less than 1 second.</li><li id="f7e2" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk">Ability to support asynchronous computations to derive user understanding data, such as user segments and session engagement.</li><li id="3dc7" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk">Ability to allow various teams to easily define pipelines to capture user activities.</li></ul><h1 id="2cb3" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">USP System Architecture</h1><p id="16a2" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">USP consists of a data pipeline layer and an online serving layer. 
The data pipeline layer is based on the <a class="af qb" href="https://en.wikipedia.org/wiki/Lambda_architecture" rel="noopener ugc nofollow" target="_blank">Lambda architecture</a> with an online streaming component that processes <a class="af qb" href="https://kafka.apache.org/" rel="noopener ugc nofollow" target="_blank">Kafka</a> events in near real-time and an offline component for data correction and backfill. The online serving layer performs read-time operations by querying the <a class="af qb" href="https://en.wikipedia.org/wiki/Key%E2%80%93value_database" rel="noopener ugc nofollow" target="_blank">Key-Value</a> (KV) store, which is written by the data pipeline layer. At a high level, the diagram below shows the lifecycle of user events: they are produced by Airbnb applications, transformed via <a class="af qb" href="https://flink.apache.org/" rel="noopener ugc nofollow" target="_blank">Flink</a>, stored in the KV store, and then served via the service layer:</p><figure class="of og oh oi oj ok oc od paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="oc od qc"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*1zHh1jsXPpJ6MTKm 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*1zHh1jsXPpJ6MTKm 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*1zHh1jsXPpJ6MTKm 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*1zHh1jsXPpJ6MTKm 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*1zHh1jsXPpJ6MTKm 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*1zHh1jsXPpJ6MTKm 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*1zHh1jsXPpJ6MTKm 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*1zHh1jsXPpJ6MTKm 640w, https://miro.medium.com/v2/resize:fit:720/0*1zHh1jsXPpJ6MTKm 720w, https://miro.medium.com/v2/resize:fit:750/0*1zHh1jsXPpJ6MTKm 750w, https://miro.medium.com/v2/resize:fit:786/0*1zHh1jsXPpJ6MTKm 786w, https://miro.medium.com/v2/resize:fit:828/0*1zHh1jsXPpJ6MTKm 828w, https://miro.medium.com/v2/resize:fit:1100/0*1zHh1jsXPpJ6MTKm 1100w, https://miro.medium.com/v2/resize:fit:1400/0*1zHh1jsXPpJ6MTKm 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qd ff qe oc od qf qg bf b bg z du"><strong class="bf os">Figure 1. 
USP System Architecture Overview</strong></figcaption></figure><p id="441e" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">Key design choices that were made:</p><ul class=""><li id="271c" class="ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob pt pu pv bk">We chose <a class="af qb" href="https://flink.apache.org/" rel="noopener ugc nofollow" target="_blank">Flink</a> streaming over <a class="af qb" href="https://spark.apache.org/" rel="noopener ugc nofollow" target="_blank">Spark</a> streaming because we previously experienced event delays with Spark due to the difference between micro-batch streaming (Spark streaming), which processes data streams as a series of small batch jobs, and event-based streaming (Flink), which processes event by event.</li><li id="e822" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk">We decided to store transformed data in an append-only manner in the KV store with the event processing timestamp as a version. 
This greatly reduces complexity because, with at-least-once processing, it guarantees idempotency even if the same events are processed multiple times via stream or batch processing.</li><li id="d2d3" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk">To make USP developer-friendly, especially for teams that are not familiar with Flink operations, we used a config-based developer workflow that generates job templates and lets developers define transforms, which are shared between Flink and batch jobs.</li></ul><h1 id="56f4" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">USP Capabilities</h1><p id="33a9" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">USP supports several types of user event processing based on the above streaming architecture. The diagram below is a detailed view of various user event processing flows within USP. Source Kafka events from user activities are first transformed into User Signals, which are written to the KV store for querying purposes and also emitted as Kafka events. These transformed Kafka events are consumed by user understanding jobs (such as User Segments and Session Engagements) to trigger asynchronous computations. 
The USP service layer handles online query requests by querying the KV store and performing any other query time operations.</p><figure class="of og oh oi oj ok oc od paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="oc od qh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*MgmfxGTHkspd_Npc 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*MgmfxGTHkspd_Npc 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*MgmfxGTHkspd_Npc 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*MgmfxGTHkspd_Npc 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*MgmfxGTHkspd_Npc 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*MgmfxGTHkspd_Npc 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*MgmfxGTHkspd_Npc 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*MgmfxGTHkspd_Npc 640w, https://miro.medium.com/v2/resize:fit:720/0*MgmfxGTHkspd_Npc 720w, https://miro.medium.com/v2/resize:fit:750/0*MgmfxGTHkspd_Npc 750w, https://miro.medium.com/v2/resize:fit:786/0*MgmfxGTHkspd_Npc 786w, https://miro.medium.com/v2/resize:fit:828/0*MgmfxGTHkspd_Npc 828w, https://miro.medium.com/v2/resize:fit:1100/0*MgmfxGTHkspd_Npc 1100w, https://miro.medium.com/v2/resize:fit:1400/0*MgmfxGTHkspd_Npc 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qd ff qe oc od qf qg bf b bg z du"><strong class="bf os">Figure 2. USP Capabilities Flow</strong></figcaption></figure><h2 id="c630" class="qi or gu bf os qj qk dy ow ql qm ea pa np qn qo qp nt qq qr qs nx qt qu qv qw bk">User Signals</h2><p id="da0c" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">User signals correspond to a list of recent user activities that are queryable by signal type, start time, and end time. Searches, home views, and bookings are example signal types. When creating a new User Signal, the developer defines a config that specifies the source Kafka event and the transform class. 
Below is an example User Signal definition with a config and a user-defined transform class.</p><pre class="of og oh oi oj qx qy qz bp ra bb bk">- name: example_signal<br />  type: simple<br />  signal_class: com.airbnb.usp.api.ExampleSignal<br />  event_sources:<br />  - kafka_topic: example_source_event<br />    transform: com.airbnb.usp.transforms.ExampleSignalTransform</pre><pre class="rg qx qy qz bp ra bb bk">public class ExampleSignalTransform extends AbstractSignalTransform {<br />  @Override<br />  public boolean isValidEvent(ExampleSourceEvent event) {<br />  }<br /><br />  @Override<br />  public ExampleSignal transform(ExampleSourceEvent event) {<br />  }<br />}</pre><p id="a899" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">Developers can also specify a join signal, which joins multiple source Kafka events on a specified join key in near real time via stateful streaming, with RocksDB as the state store.</p><pre class="of og oh oi oj qx qy qz bp ra bb bk">- name: example_join_signal<br />  type: left_join<br />  signal_class: com.airbnb.usp.api.ExampleJoinSignal<br />  transform: com.airbnb.usp.transforms.ExampleJoinSignalTransform<br />  left_event_source:<br />    kafka_topic: example_left_source_event<br />    join_key_field: example_join_key<br />  right_event_source:<br />    kafka_topic: example_right_source_event<br />    join_key_field: example_join_key</pre><p id="24e5" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">Once the config and the transform class are defined for a signal, developers run a script to auto-generate Flink configurations, backfill batch files, and alert files, as shown below:</p><pre class="of og oh oi oj qx qy qz bp ra bb bk">$ python3 setup_signal.py --signal example_signal<br /><br />Generates:<br /><br /># Flink configuration related<br />[1] ../flink/signals/flink-jobs.yaml<br />[2] ../flink/signals/example_signal-streaming.conf<br /><br /># Backfill related files<br />[3] ../batch/example_signal-batch.py<br /><br /># Alerts related files<br />[4] ../alerts/example_signal-events_written_anomaly.yaml<br />[5] ../alerts/example_signal-overall_latency_high.yaml<br />[6] ../alerts/example_signal-overall_success_rate_low.yaml</pre>
<h2 id="45a8" class="qi or gu bf os qj qk dy ow ql qm ea pa np qn qo qp nt qq qr qs nx qt qu qv qw bk">User Segments</h2><p id="89b9" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">User Segments provide the ability to define user cohorts in near real time, with different triggering criteria for compute and various start and expiration conditions. The user-defined transform exposes several abstract methods in which developers implement the business logic without having to worry about streaming components.</p><p id="eb4a" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">For example, the active trip planner is a User Segment that assigns a guest to the segment as soon as the guest performs a search and removes the guest from the segment after 14 days of inactivity or once the guest makes a booking. Below are the abstract methods that the developer implements to create the active trip planner User Segment:</p><ul class=""><li id="3334" class="ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob pt pu pv bk"><strong class="ng gv">inSegment</strong>: Given the triggered User Signals, check if the given user is in the segment.</li><li id="1c03" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk"><strong class="ng gv">getStartTimestamp</strong>: Define the start time when the given user will be in the segment. 
For example, when the user starts a search on Airbnb, the start time is set to the search timestamp and the user is immediately placed in this user segment.</li><li id="8b4b" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk"><strong class="ng gv">getExpirationTimestamp</strong>: Define the end time when the given user will be out of the segment. For example, when the user performs a search, the user stays in the segment for the next 14 days; if another triggering User Signal arrives before then, the expiration time is updated accordingly.</li></ul><pre class="of og oh oi oj qx qy qz bp ra bb bk">public class ExampleSegmentTransform extends AbstractSegmentTransform {<br />  @Override<br />  protected boolean inSegment(List&lt;Signal&gt; inputSignals) {<br />  }<br /><br />  @Override<br />  public Instant getStartTimestamp(List&lt;Signal&gt; inputSignals) {<br />  }<br /><br />  @Override<br />  public Instant getExpirationTimestamp(List&lt;Signal&gt; inputSignals) {<br />  }<br />}</pre><h2 id="5dc1" class="qi or gu bf os qj qk dy ow ql qm ea pa np qn qo qp nt qq qr qs nx qt qu qv qw bk">Session Engagements</h2><p id="8f55" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">The session engagement Flink job enables developers to group and analyze a series of short-term user actions, known as session engagements, to gain insights into holistic user behavior within a specific timeframe. 
For example, knowing which photos of homes the guest viewed in the current session helps derive the guest’s preferences for the upcoming trip.</p><p id="3e11" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">As transformed Kafka events from User Signals are ingested, the job splits the stream into keyed streams, using the user ID as the key, so that the computation can be performed in parallel.</p><p id="80ae" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">The job employs various windowing techniques, such as sliding windows and session windows, to trigger computations based on aggregated user actions within these windows. Sliding windows continuously advance by a specified time interval, while session windows dynamically adjust based on user activity patterns. For example, as a user browses multiple listings on the Airbnb app, a 10-minute sliding window that slides every 5 minutes is used to analyze the user’s short-term engagement and generate the user’s short-term trip preference.</p><p id="d233" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">The asynchronous compute pattern empowers developers to execute resource-intensive operations, such as running ML models or making service calls, without disrupting the real-time processing pipeline. 
This approach ensures that computed user understanding data is efficiently stored and readily available for rapid querying from the KV store.</p><figure class="of og oh oi oj ok oc od paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="oc od rh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*r7kzHxNg10NYsMAt 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*r7kzHxNg10NYsMAt 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*r7kzHxNg10NYsMAt 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*r7kzHxNg10NYsMAt 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*r7kzHxNg10NYsMAt 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*r7kzHxNg10NYsMAt 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*r7kzHxNg10NYsMAt 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*r7kzHxNg10NYsMAt 640w, https://miro.medium.com/v2/resize:fit:720/0*r7kzHxNg10NYsMAt 720w, https://miro.medium.com/v2/resize:fit:750/0*r7kzHxNg10NYsMAt 750w, https://miro.medium.com/v2/resize:fit:786/0*r7kzHxNg10NYsMAt 786w, https://miro.medium.com/v2/resize:fit:828/0*r7kzHxNg10NYsMAt 828w, https://miro.medium.com/v2/resize:fit:1100/0*r7kzHxNg10NYsMAt 1100w, https://miro.medium.com/v2/resize:fit:1400/0*r7kzHxNg10NYsMAt 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 
50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qd ff qe oc od qf qg bf b bg z du"><strong class="bf os">Figure 3. Session Engagements Flow</strong></figcaption></figure><h1 id="2482" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">Flink Operations</h1><p id="8029" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">USP is a stream processing platform built for developers. Below are some of the learnings from operating hundreds of Flink jobs.</p><h2 id="f5c2" class="qi or gu bf os qj qk dy ow ql qm ea pa np qn qo qp nt qq qr qs nx qt qu qv qw bk">Metrics</h2><p id="06e5" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">We use various latency metrics to measure the performance of streaming jobs.</p><ul class=""><li id="677c" class="ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob pt pu pv bk"><strong class="ng gv">Event Latency</strong>: From when the user events are generated from applications to when the transformed events are written to the KV store.</li><li id="0a12" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk"><strong class="ng gv">Ingestion Latency</strong>: From when the user events arrive at the Kafka cluster to when the transformed events are written to the KV store.</li><li id="067f" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk"><strong class="ng gv">Job Latency</strong>: From when the Flink job starts 
processing source Kafka events to when the transformed events are written to the KV store.</li><li id="b93d" class="ne nf gu ng b nh pw nj nk nl px nn no np py nr ns nt pz nv nw nx qa nz oa ob pt pu pv bk"><strong class="ng gv">Transform Latency</strong>: From when the Flink job starts processing source Kafka events to when the Flink job finishes the transformation.</li></ul><figure class="of og oh oi oj ok oc od paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="oc od ri"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*10I3qsGy0L0HPL7d 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*10I3qsGy0L0HPL7d 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*10I3qsGy0L0HPL7d 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*10I3qsGy0L0HPL7d 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*10I3qsGy0L0HPL7d 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*10I3qsGy0L0HPL7d 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*10I3qsGy0L0HPL7d 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*10I3qsGy0L0HPL7d 640w, https://miro.medium.com/v2/resize:fit:720/0*10I3qsGy0L0HPL7d 720w, https://miro.medium.com/v2/resize:fit:750/0*10I3qsGy0L0HPL7d 750w, https://miro.medium.com/v2/resize:fit:786/0*10I3qsGy0L0HPL7d 786w, https://miro.medium.com/v2/resize:fit:828/0*10I3qsGy0L0HPL7d 828w, 
https://miro.medium.com/v2/resize:fit:1100/0*10I3qsGy0L0HPL7d 1100w, https://miro.medium.com/v2/resize:fit:1400/0*10I3qsGy0L0HPL7d 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qd ff qe oc od qf qg bf b bg z du"><strong class="bf os">Figure 4. Flink Job Metrics</strong></figcaption></figure><p id="4503" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">Event Latency is the end-to-end latency, measuring how long it takes for a generated user action to become queryable. This metric can be difficult to control: if the Flink job relies on client-side events, the events themselves may not be readily ingestible because of a slow network on the client device or because the client batches logs for performance. For these reasons, it’s preferable to rely on server-side events over client-side events as the source user events whenever comparable server-side events are available.</p><p id="651c" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">Ingestion Latency is the main metric we monitor. 
It covers various issues that can happen at different stages, such as overloaded Kafka topics and latency when writing to the KV store (caused by client pool issues, rate limits, or service instability).</p><h2 id="cf6f" class="qi or gu bf os qj qk dy ow ql qm ea pa np qn qo qp nt qq qr qs nx qt qu qv qw bk">Improving Flink Job stability with standby Task Managers</h2><p id="dc7c" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">Flink is a distributed system in which a single Job Manager orchestrates tasks across multiple Task Managers, which act as the actual workers. When a Flink job is ingesting a Kafka topic, different partitions of the Kafka topic are assigned to different Task Managers. If one Task Manager fails, incoming Kafka events from the partitions assigned to that Task Manager are blocked until a replacement Task Manager is created. Unlike horizontally scaled online services, where a pod can simply be replaced and traffic rebalanced, Flink assigns fixed partitions of the input Kafka topics to Task Managers without automatic reassignment. This creates large backlogs of events from the Kafka partitions of the failed Task Manager, while the other Task Managers continue processing events from their own partitions.</p><p id="d33f" class="pw-post-body-paragraph ne nf gu ng b nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gn bk">In order to reduce this downtime, we provision extra hot-standby pods. In the diagram below, on the left side, the job is running in a stable state with four Task Managers and one additional Task Manager (Task Manager 5) as a hot standby. On the right side, when Task Manager 4 fails, the standby Task Manager 5 immediately starts processing tasks for the terminated pod instead of waiting for a new pod to spin up. Eventually another standby pod will be created. 
In this way, we can achieve better stability with a small cost of having standby pods.</p><figure class="of og oh oi oj ok oc od paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="oc od oe"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*xhxfJ_KyoOuZ06Dv 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*xhxfJ_KyoOuZ06Dv 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*xhxfJ_KyoOuZ06Dv 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*xhxfJ_KyoOuZ06Dv 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*xhxfJ_KyoOuZ06Dv 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*xhxfJ_KyoOuZ06Dv 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*xhxfJ_KyoOuZ06Dv 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*xhxfJ_KyoOuZ06Dv 640w, https://miro.medium.com/v2/resize:fit:720/0*xhxfJ_KyoOuZ06Dv 720w, https://miro.medium.com/v2/resize:fit:750/0*xhxfJ_KyoOuZ06Dv 750w, https://miro.medium.com/v2/resize:fit:786/0*xhxfJ_KyoOuZ06Dv 786w, https://miro.medium.com/v2/resize:fit:828/0*xhxfJ_KyoOuZ06Dv 828w, https://miro.medium.com/v2/resize:fit:1100/0*xhxfJ_KyoOuZ06Dv 1100w, https://miro.medium.com/v2/resize:fit:1400/0*xhxfJ_KyoOuZ06Dv 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 
67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qd ff qe oc od qf qg bf b bg z du"><strong class="bf os">Figure 5. Flink Job Manager And Task Manager Setup</strong></figcaption></figure><h1 id="5f03" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">Conclusion</h1><p id="3f1b" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">Over the last several years, USP has played a crucial role as a platform empowering numerous teams to achieve product personalization. Currently, USP processes over 1 million events per second across 100+ Flink jobs and the USP service serves 70k queries per second. For future work, we are looking into different types of asynchronous compute patterns via Flink to improve performance.</p><h1 id="16b9" class="oq or gu bf os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn bk">Acknowledgments</h1><p id="3175" class="pw-post-body-paragraph ne nf gu ng b nh po nj nk nl pp nn no np pq nr ns nt pr nv nw nx ps nz oa ob gn bk">USP is a collaborative effort between Airbnb’s Search Infrastructure and Stream Infrastructure, particularly Derrick Chie, Ran Zhang, Yi Li. Big thanks to our former teammates who contributed to this work: Emily Hsia, Youssef Francis, Swaroop Jagadish, Brandon Bevans, Zhi Feng, Wei Sun, Alex Tian, Wei Hou.</p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/building-a-user-signals-platform-at-airbnb-b236078ec82b</link>
      <guid>https://medium.com/airbnb-engineering/building-a-user-signals-platform-at-airbnb-b236078ec82b</guid>
      <pubDate>Wed, 20 Nov 2024 20:27:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb’s AI-powered photo tour using Vision Transformer]]></title>
      <description><![CDATA[<div><div><h2 id="168a" class="pw-subtitle-paragraph hr gt gu bf b hs ht hu hv hw hx hy hz ia ib ic id ie if ig cq du">Boosting computer vision accuracy and performance at Airbnb</h2><div></div><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj nk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*QSGRcScNdh7js2oG 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*QSGRcScNdh7js2oG 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*QSGRcScNdh7js2oG 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*QSGRcScNdh7js2oG 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*QSGRcScNdh7js2oG 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*QSGRcScNdh7js2oG 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*QSGRcScNdh7js2oG 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*QSGRcScNdh7js2oG 640w, https://miro.medium.com/v2/resize:fit:720/0*QSGRcScNdh7js2oG 720w, https://miro.medium.com/v2/resize:fit:750/0*QSGRcScNdh7js2oG 750w, https://miro.medium.com/v2/resize:fit:786/0*QSGRcScNdh7js2oG 786w, https://miro.medium.com/v2/resize:fit:828/0*QSGRcScNdh7js2oG 828w, https://miro.medium.com/v2/resize:fit:1100/0*QSGRcScNdh7js2oG 1100w, https://miro.medium.com/v2/resize:fit:1400/0*QSGRcScNdh7js2oG 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="2850" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><strong class="ny gv">By:</strong> <a class="af os" href="https://www.linkedin.com/in/peixiong/" rel="noopener ugc nofollow" target="_blank">Pei Xiong</a>, <a class="af os" href="https://www.linkedin.com/in/xiaoxinyin/" rel="noopener ugc nofollow" target="_blank">Aaron Yin</a>, <a class="af os" href="https://www.linkedin.com/in/jian-zhang-3b013b2a/" rel="noopener ugc nofollow" target="_blank">Jian Zhang</a>, <a class="af os" href="https://www.linkedin.com/in/lifanyang/" rel="noopener ugc nofollow" target="_blank">Lifan Yang</a>, <a class="af os" href="https://www.linkedin.com/in/luzhangtracy/" rel="noopener ugc nofollow" target="_blank">Lu Zhang</a>, <a class="af os" href="https://www.linkedin.com/in/deanchen1/" rel="noopener ugc nofollow" target="_blank">Dean Chen</a></p><h1 id="e8b2" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Introduction</h1><p id="5d06" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">In recent years, the integration of artificial intelligence with travel platforms has transformed how people search for and book accommodations. 
As a leading global marketplace for unique travel experiences and accommodations, Airbnb constantly strives to enhance the guest experience by providing informative content about the variety of homes shared by our hosts. One of the ways we help guests better understand what a listing offers before they book is through our <a class="af os" href="https://www.airbnb.co.in/help/article/3509" rel="noopener ugc nofollow" target="_blank">AI-powered photo tour</a> feature.</p><p id="b8a4" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">The AI-powered photo tour in the Listings tab, which helps hosts better organize their listing photos, leverages fine-tuned Vision Transformers to assess a diverse set of listing images and accurately identify and classify photos into specific rooms and spaces. In this blog post, we will dive into the inner workings of the photo tour, including model selection, pretraining, fine-tuning techniques, and the trade-offs between computational costs and scalability. 
We will also specifically discuss how we enhanced model accuracy despite having limited training data.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*0fetI9mSr7qtJBun 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*0fetI9mSr7qtJBun 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*0fetI9mSr7qtJBun 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*0fetI9mSr7qtJBun 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*0fetI9mSr7qtJBun 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*0fetI9mSr7qtJBun 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*0fetI9mSr7qtJBun 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*0fetI9mSr7qtJBun 640w, https://miro.medium.com/v2/resize:fit:720/0*0fetI9mSr7qtJBun 720w, https://miro.medium.com/v2/resize:fit:750/0*0fetI9mSr7qtJBun 750w, https://miro.medium.com/v2/resize:fit:786/0*0fetI9mSr7qtJBun 786w, https://miro.medium.com/v2/resize:fit:828/0*0fetI9mSr7qtJBun 828w, https://miro.medium.com/v2/resize:fit:1100/0*0fetI9mSr7qtJBun 1100w, https://miro.medium.com/v2/resize:fit:1400/0*0fetI9mSr7qtJBun 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 1: Photo Tour product powered by ML</figcaption></figure><h1 id="41e7" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Methodology</h1><h1 id="035f" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Room Classification</h1><p id="3075" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Room-type classification is the first aspect of the photo tour. The goal is to accurately categorize images into the 16 room types defined in the Airbnb product, such as ‘Bedroom’, ‘Full bathroom’, ‘Half bathroom’, ‘Living room’, and ‘Kitchen’, giving users a comprehensive understanding of the available spaces. The challenge lies in the diversity of room layouts and lighting conditions, and in the need for models that generalize well across various environments.</p><p id="53ca" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We conducted experiments using several state-of-the-art models, including <a class="af os" href="https://arxiv.org/abs/2010.11929" rel="noopener ugc nofollow" target="_blank">Vision Transformer</a> (ViT) variants: ViT-base and ViT-large at different resolutions. 
Additionally, we explored the performance of <a class="af os" href="https://arxiv.org/abs/2301.00808" rel="noopener ugc nofollow" target="_blank">ConvNeXt V2</a>, a recently proposed convolutional neural network with comparable performance to ViT, and <a class="af os" href="https://arxiv.org/abs/2204.01697" rel="noopener ugc nofollow" target="_blank">MaxViT</a>, a variant combining the strengths of both Vision Transformers and CNNs. At the beginning of this project, we tested these approaches on an image classification task with Airbnb’s host-provided data and found that ViT outperformed the other approaches. We therefore chose ViT for our subsequent studies.</p><h1 id="2562" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Image Similarity</h1><p id="0389" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Another key component of the photo tour is image clustering, which groups the images of the same room into a cluster. A prerequisite is the ability to measure the similarity between two images, which indicates the probability that they belong to the same room. This is a supervised classification problem: the input is a pair of images, and the output is a binary label of 0 or 1. 
As shown in Figure 2, we employed a Siamese network that processes two images simultaneously by applying the same image embedding model to each image and then computing the cosine similarity of the resulting embeddings.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj nk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*hOTMJ5eeNRizpYHS 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*hOTMJ5eeNRizpYHS 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*hOTMJ5eeNRizpYHS 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*hOTMJ5eeNRizpYHS 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*hOTMJ5eeNRizpYHS 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*hOTMJ5eeNRizpYHS 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*hOTMJ5eeNRizpYHS 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*hOTMJ5eeNRizpYHS 640w, https://miro.medium.com/v2/resize:fit:720/0*hOTMJ5eeNRizpYHS 720w, https://miro.medium.com/v2/resize:fit:750/0*hOTMJ5eeNRizpYHS 750w, https://miro.medium.com/v2/resize:fit:786/0*hOTMJ5eeNRizpYHS 786w, https://miro.medium.com/v2/resize:fit:828/0*hOTMJ5eeNRizpYHS 828w, https://miro.medium.com/v2/resize:fit:1100/0*hOTMJ5eeNRizpYHS 1100w, https://miro.medium.com/v2/resize:fit:1400/0*hOTMJ5eeNRizpYHS 1400w" sizes="(min-resolution: 4dppx) 
and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 2: An illustration of Siamese network for image similarity</figcaption></figure><h1 id="7a09" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Accuracy Improvement</h1><p id="cf3d" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Our analysis found that the volume of training data is key to higher prediction accuracy. Doubling the training data volume typically leads to a reduction of error rate of ≈5% on average, with the effect being more significant in the earlier stages.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj nk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*aRJ9s9sgGtULEkdX 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*aRJ9s9sgGtULEkdX 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*aRJ9s9sgGtULEkdX 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*aRJ9s9sgGtULEkdX 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*aRJ9s9sgGtULEkdX 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*aRJ9s9sgGtULEkdX 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*aRJ9s9sgGtULEkdX 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 
3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*aRJ9s9sgGtULEkdX 640w, https://miro.medium.com/v2/resize:fit:720/0*aRJ9s9sgGtULEkdX 720w, https://miro.medium.com/v2/resize:fit:750/0*aRJ9s9sgGtULEkdX 750w, https://miro.medium.com/v2/resize:fit:786/0*aRJ9s9sgGtULEkdX 786w, https://miro.medium.com/v2/resize:fit:828/0*aRJ9s9sgGtULEkdX 828w, https://miro.medium.com/v2/resize:fit:1100/0*aRJ9s9sgGtULEkdX 1100w, https://miro.medium.com/v2/resize:fit:1400/0*aRJ9s9sgGtULEkdX 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 3: correlation between data volume and accuracy</figcaption></figure><p id="1bc1" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Unfortunately, it is very expensive to acquire high-quality training data as it requires human labeling. Therefore, we needed to find other ways to improve model accuracy with a limited amount of training data. 
We followed these steps to improve model accuracy:</p><p id="7d2b" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><strong class="ny gv">Step 1 — Pre-training</strong>: We started from a model pre-trained on ImageNet and trained it further on a large amount of host-provided data, whose labels are less accurate and cover only some of our classes. This provided a baseline model for transfer learning in the following steps.</p><p id="53f5" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><strong class="ny gv">Step 2 — Multi-task training</strong>: We fine-tuned the model from the previous step using both higher-accuracy training data for the target task (e.g., room-type classification) and additional training data labeled for a related task (e.g., object detection). This gave us more training data and produced several different models for the following steps.</p><p id="04e3" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><strong class="ny gv">Step 3 — Ensemble learning</strong>: We built an ensemble from multiple Step 2 models, obtained by training with different auxiliary tasks and by using different ViT variants (e.g., ViT-base vs. ViT-large, and/or input image sizes of 224 vs. 384). This gave us a diverse set of models, from which we selected the best performers to construct the final ensemble model.</p><p id="4a28" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><strong class="ny gv">Step 4 — Distillation</strong>: Although the ensemble model has higher accuracy than any individual model, it requires more computational resources and thus increases the latency and cost of our product. 
We trained a distilled model to imitate the behavior of the ensemble model; it achieves similar accuracy at a computational cost that is several times lower.</p><h1 id="fb06" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Pre-training and Traditional Fine-tuning</h1><p id="3272" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Our pre-training process used Airbnb’s vast repository of listing photos, comprising millions of images, to train a Vision Transformer (ViT) model. While pre-training on Airbnb listing photos provides a substantial advantage, the dataset also has limitations. The human-labeled dataset contained inaccuracies and mislabels, which materially impaired the model’s ability to discern patterns. Another notable limitation is that the pre-training dataset covers only four of the 16 room classifications.</p><p id="ad0c" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Therefore, expanding fine-tuning to cover the additional classes was imperative. We developed a detailed, updated labeling guideline and generated a human-labeled dataset covering all 16 room classifications. Iterative fine-tuning gradually extended the model to all 16 room types, yielding a more comprehensive and versatile model.</p><h1 id="ad48" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Multi-task Learning</h1><p id="484d" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Acquiring high-quality human-labeled training data is challenging because labeling is costly and time-consuming. 
Despite this, we had already accumulated a large repository of labeled data across other various tasks, including room-type classification, image quality prediction, same-room classification, category classification, and object detection. By fully utilizing this extensive and diversely labeled dataset, we significantly improved the prediction accuracy in our tasks. To achieve this, we implemented multi-task training that incorporates additional label classes from existing tasks, as demonstrated in Figure 4. Each learner is a vision transformer, and in addition to predicting a single set of labels, we allowed different learners to learn other label types, such as amenities and ImageNet21k labels, which further boosts overall performance as shown in Table 1.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj nk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*j94l6uJXX8Jc6cqK 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*j94l6uJXX8Jc6cqK 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*j94l6uJXX8Jc6cqK 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*j94l6uJXX8Jc6cqK 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*j94l6uJXX8Jc6cqK 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*j94l6uJXX8Jc6cqK 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*j94l6uJXX8Jc6cqK 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" 
/><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*j94l6uJXX8Jc6cqK 640w, https://miro.medium.com/v2/resize:fit:720/0*j94l6uJXX8Jc6cqK 720w, https://miro.medium.com/v2/resize:fit:750/0*j94l6uJXX8Jc6cqK 750w, https://miro.medium.com/v2/resize:fit:786/0*j94l6uJXX8Jc6cqK 786w, https://miro.medium.com/v2/resize:fit:828/0*j94l6uJXX8Jc6cqK 828w, https://miro.medium.com/v2/resize:fit:1100/0*j94l6uJXX8Jc6cqK 1100w, https://miro.medium.com/v2/resize:fit:1400/0*j94l6uJXX8Jc6cqK 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 4: Multi-task learning illustration</figcaption></figure><h1 id="b9f6" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Ensemble Learning</h1><p id="d315" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Ensemble learning is a powerful technique in machine learning that leverages diverse models with similar accuracies to achieve better accuracy and generalization.</p><p id="588c" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We applied ensemble learning on diverse models with different architectures, model sizes, and auxiliary tasks such as amenities and ImageNet21k class predictions. Upon aggregating the predictions of the individual models, we observed a notable increase in the overall accuracy compared to any single model. 
The observed improvement is credited to the ensemble’s capability to address and reduce both misclassifications and inaccuracies of individual models, leading to more accurate predictions, despite the limited human-labeled training data.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*kBMlip60KYSAYu9J 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*kBMlip60KYSAYu9J 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*kBMlip60KYSAYu9J 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*kBMlip60KYSAYu9J 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*kBMlip60KYSAYu9J 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*kBMlip60KYSAYu9J 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*kBMlip60KYSAYu9J 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*kBMlip60KYSAYu9J 640w, https://miro.medium.com/v2/resize:fit:720/0*kBMlip60KYSAYu9J 720w, https://miro.medium.com/v2/resize:fit:750/0*kBMlip60KYSAYu9J 750w, https://miro.medium.com/v2/resize:fit:786/0*kBMlip60KYSAYu9J 786w, https://miro.medium.com/v2/resize:fit:828/0*kBMlip60KYSAYu9J 828w, https://miro.medium.com/v2/resize:fit:1100/0*kBMlip60KYSAYu9J 1100w, https://miro.medium.com/v2/resize:fit:1400/0*kBMlip60KYSAYu9J 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="60c3" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Knowledge Distillation</h1><p id="5fdc" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">While ensemble learning offers substantial gains in accuracy, it requires heightened computational resources as multiple large models are involved in each inference task. To prioritize model efficiency without compromising performance, we turned to knowledge distillation, a technique centered around transferring knowledge from a sophisticated ensemble of models to a more compact single model.</p><p id="a56f" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Our distillation process transfers the knowledge encoded in both hard targets and the soft targets of a complex ensemble to a smaller and simpler model. Hard targets are ground-truth labels while the soft targets are the ensemble’s probabilistic predictions, enabling the smaller model to capture the nuanced decision boundaries learned by the ensemble. 
The overall training objective is a weighted combination of the two losses:</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div class="ni nj qa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 1100w, https://miro.medium.com/v2/resize:fit:744/format:webp/1*j0iBmP9jQTRiAVZ_srkOpA.png 744w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 372px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*j0iBmP9jQTRiAVZ_srkOpA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*j0iBmP9jQTRiAVZ_srkOpA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*j0iBmP9jQTRiAVZ_srkOpA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*j0iBmP9jQTRiAVZ_srkOpA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*j0iBmP9jQTRiAVZ_srkOpA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*j0iBmP9jQTRiAVZ_srkOpA.png 1100w, https://miro.medium.com/v2/resize:fit:744/1*j0iBmP9jQTRiAVZ_srkOpA.png 744w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and 
(max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 372px" /></picture></div></figure><p id="00c4" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">where the first loss is the cross-entropy loss on the hard targets, the second loss is the Kullback-Leibler divergence between the soft targets from the ensemble and the predictions of the student model, and the distillation coefficient determines the weight assigned to the distillation loss.</p><p id="1cc6" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Remarkably, our distilled model performed on par with the ensemble models while requiring significantly less inference time and fewer resources. 
This outcome demonstrates the efficacy of knowledge distillation in preserving the ensemble’s collective intelligence within a more streamlined model.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div class="ni nj qb"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Xd5kcCM9UglJ6eUR 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Xd5kcCM9UglJ6eUR 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Xd5kcCM9UglJ6eUR 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Xd5kcCM9UglJ6eUR 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Xd5kcCM9UglJ6eUR 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Xd5kcCM9UglJ6eUR 1100w, https://miro.medium.com/v2/resize:fit:1188/format:webp/0*Xd5kcCM9UglJ6eUR 1188w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 594px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Xd5kcCM9UglJ6eUR 640w, https://miro.medium.com/v2/resize:fit:720/0*Xd5kcCM9UglJ6eUR 720w, https://miro.medium.com/v2/resize:fit:750/0*Xd5kcCM9UglJ6eUR 750w, https://miro.medium.com/v2/resize:fit:786/0*Xd5kcCM9UglJ6eUR 786w, https://miro.medium.com/v2/resize:fit:828/0*Xd5kcCM9UglJ6eUR 828w, https://miro.medium.com/v2/resize:fit:1100/0*Xd5kcCM9UglJ6eUR 1100w, https://miro.medium.com/v2/resize:fit:1188/0*Xd5kcCM9UglJ6eUR 1188w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 
700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 594px" /></picture></div></figure><h1 id="628b" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Golden Evaluation</h1><p id="43a6" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">As part of the preparations for the launch of our end-to-end Photo Tour, we employed a rigorous evaluation process called “Golden Evaluation”, which mimics the actual user experience by calculating the minimum number of changes required to make the Photo Tour generated by our model identical to the human-labeled ground truth (i.e., the Golden Evaluation). In contrast to training data that is evenly distributed across classes, the golden evaluation operates at the Airbnb listing level, aiming to replicate the user’s perspective. We sampled listings, each containing an average of 25–30 photos, and measured accuracy by the <em class="qc">minimum number of corrections </em>required to make assignments consistent with human labels. A correction is a change in room assignment, in which a photo’s predicted room is changed to match the consensus room label derived from multiple human labels. 
For example, if a photo of bedroom 1 is falsely assigned to the living room, one correction is required to move it from the living room to bedroom 1.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div class="ni nj qd"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 1100w, https://miro.medium.com/v2/resize:fit:1082/format:webp/1*3_mRqBAuGZabCa-mRWV1gA.png 1082w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 541px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*3_mRqBAuGZabCa-mRWV1gA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*3_mRqBAuGZabCa-mRWV1gA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*3_mRqBAuGZabCa-mRWV1gA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*3_mRqBAuGZabCa-mRWV1gA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*3_mRqBAuGZabCa-mRWV1gA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*3_mRqBAuGZabCa-mRWV1gA.png 1100w, https://miro.medium.com/v2/resize:fit:1082/1*3_mRqBAuGZabCa-mRWV1gA.png 1082w" sizes="(min-resolution: 
4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 541px" /></picture></div></figure><p id="59cd" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Some photos cannot be properly assigned to a named space. We classified miscellaneous photos, including close-up shots, images containing humans or animals, and photos of nearby shopping areas, restaurants, and parks, into a category labeled “Others”. Furthermore, if a photo shows an empty part of a room such that we cannot determine which room it belongs to, we may designate it as “Unassigned”; such photos do not count in the accuracy calculation. This scenario occurs infrequently (as shown in Table 3) and is primarily used to let users decide in the most ambiguous cases. This evaluation served as the final launch criterion. 
Ultimately, we successfully reduced the error rate to 5.28%, passing the internal evaluation standard at Airbnb and Photo Tour was launched as a showcase feature in the <a class="af os" href="https://news.airbnb.com/en-in/airbnb-2023-winter-release-introducing-guest-favorites-a-collection-of-the-2-million-most-loved-homes-on-airbnb/" rel="noopener ugc nofollow" target="_blank">November 2023 product launch</a>.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div class="ni nj qe"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*vZaBjpTpPhyKBlOi 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*vZaBjpTpPhyKBlOi 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*vZaBjpTpPhyKBlOi 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*vZaBjpTpPhyKBlOi 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*vZaBjpTpPhyKBlOi 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*vZaBjpTpPhyKBlOi 1100w, https://miro.medium.com/v2/resize:fit:1004/format:webp/0*vZaBjpTpPhyKBlOi 1004w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 502px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*vZaBjpTpPhyKBlOi 640w, https://miro.medium.com/v2/resize:fit:720/0*vZaBjpTpPhyKBlOi 720w, https://miro.medium.com/v2/resize:fit:750/0*vZaBjpTpPhyKBlOi 750w, https://miro.medium.com/v2/resize:fit:786/0*vZaBjpTpPhyKBlOi 786w, https://miro.medium.com/v2/resize:fit:828/0*vZaBjpTpPhyKBlOi 828w, 
https://miro.medium.com/v2/resize:fit:1100/0*vZaBjpTpPhyKBlOi 1100w, https://miro.medium.com/v2/resize:fit:1004/0*vZaBjpTpPhyKBlOi 1004w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 502px" /></picture></div></figure><h1 id="5d0a" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Conclusion</h1><p id="85ed" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Our exploration of using Vision Transformers to improve our photo tour product has been successful and rewarding. By incorporating pretraining, multi-task learning, ensemble learning, and knowledge distillation, we’ve significantly enhanced model accuracy. Pretraining provided a strong foundation, while multi-task learning enriched the model’s ability to interpret diverse visuals. Ensemble learning combined model strengths for robust predictions, and knowledge distillation enabled efficient deployment without sacrificing accuracy.</p><p id="bd98" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">The AI-powered photo tour was launched as part of Airbnb’s <a class="af os" href="https://news.airbnb.com/en-in/airbnb-2023-winter-release-introducing-guest-favorites-a-collection-of-the-2-million-most-loved-homes-on-airbnb/" rel="noopener ugc nofollow" target="_blank">2023 Winter Release</a>. 
Since then, we have been diligently monitoring the performance of this product and continue to refine our models further for an even more seamless user experience.</p><h1 id="3e08" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Acknowledgments</h1><p id="5e0c" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">We would like to thank everyone involved in the project. A special thanks to the entire Airbnb user, listing, and platform team for their relentless efforts in developing and launching the product, ensuring its continued excellence. Additionally, we extend our gratitude to the Airbnb Machine Learning Infra team for their crucial support in building a robust infrastructure that photo tour relies upon.</p><p id="b288" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">If this type of work interests you, check out some of our related <a class="af os" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">roles</a>!</p></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnbs-ai-powered-photo-tour-using-vision-transformer-e470535f76d4</link>
      <guid>https://medium.com/airbnb-engineering/airbnbs-ai-powered-photo-tour-using-vision-transformer-e470535f76d4</guid>
      <pubDate>Wed, 13 Nov 2024 18:39:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Adopting Bazel for Web at Scale]]></title>
      <description><![CDATA[<figure class="gv gw gx gy gz ha gs gt paragraph-image"><div role="button" tabindex="0" class="hb hc fj hd bh he"><div class="gs gt gu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*uMA-yyBcSyRjQBwdQnbDdw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><div><div><h2 id="dfaa" class="pw-subtitle-paragraph ig hi hj bf b ih ii ij ik il im in io ip iq ir is it iu iv cq du">How and Why We Migrated Airbnb’s Large-Scale Web Monorepo to Bazel</h2><div></div><p id="3dee" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><strong class="ny hk">By:</strong> <a class="af os" href="https://www.linkedin.com/in/bbunge/" rel="noopener ugc nofollow" target="_blank">Brie Bunge</a> and <a class="af os" href="https://www.linkedin.com/in/sharmilajesupaul/" rel="noopener ugc nofollow" target="_blank">Sharmila Jesupaul</a></p><h1 id="227d" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Introduction</h1><p id="ab69" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">At Airbnb, we’ve recently adopted <a class="af os" href="https://bazel.build/" rel="noopener ugc nofollow" target="_blank">Bazel</a>, Google’s open source build tool, as our universal build system across backend, web, and <a class="af os" rel="noopener" href="https://medium.com/airbnb-engineering/migrating-our-ios-build-system-from-buck-to-bazel-ddd6f3f25aa3">iOS</a> platforms. This post will cover our experience adopting Bazel for Airbnb’s large-scale (over 11 million lines of code) web monorepo. We’ll share how we prepared the codebase, the principles that guided the migration, and the process of migrating selected CI jobs. 
Our goal is to share information that would have been valuable to us when we embarked on this journey and to contribute to the growing discussion around Bazel for web development.</p><h1 id="698d" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Why did we do this?</h1><p id="eb0e" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Historically, we wrote bespoke build scripts and caching logic for various continuous integration (CI) jobs that proved challenging to maintain and consistently reached scaling limits as the repo grew. For example, our linter, <a class="af os" href="https://eslint.org/" rel="noopener ugc nofollow" target="_blank">ESLint</a>, and TypeScript’s type checking did not support multi-threaded concurrency out of the box. We extended our unit testing tool, <a class="af os" href="https://jestjs.io/" rel="noopener ugc nofollow" target="_blank">Jest</a>, to be the runner for these tools because it had an API to leverage multiple workers.</p><p id="9ec5" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Continually creating workarounds for tooling that did not support concurrency was not sustainable, and we were incurring a long-term maintenance cost. To tackle these challenges and to best support our growing codebase, we found that Bazel’s sophistication, parallelism, caching, and performance fulfilled our needs.</p><p id="c7ba" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Additionally, Bazel is language agnostic. This facilitated consolidation onto a single, universal build system across Airbnb and allowed us to share common infrastructure and expertise. 
Now, an engineer who works on our backend monorepo can switch to the web monorepo and know how to build and test things.</p><h1 id="7c6e" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Why was this hard?</h1><p id="2081" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">When we began the migration in 2021, there was no publicized industry precedent for integrating Bazel with web at scale outside of Google. Open source tooling didn’t work out of the box, and leveraging <a class="af os" href="https://bazel.build/remote/rbe" rel="noopener ugc nofollow" target="_blank">remote build execution</a> (RBE) introduced additional challenges. Our web codebase is large and contains many loose files, which led to performance issues when transmitting them to the remote environment. Additionally, we established migration principles that included improving or maintaining overall performance and reducing the impact on developers contributing to the monorepo during the transition. We effectively achieved both of these goals. Read on for more details.</p><h1 id="f50f" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Readying the Repository</h1><p id="8d95" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">We did some work up front to make the repository Bazel-ready, namely cycle breaking and automated BUILD.bazel file generation.</p><h2 id="b949" class="pu ou hj bf ov pv pw dy oy px py ea pb of pz qa qb oj qc qd qe on qf qg qh qi bk">Cycle Breaking</h2><p id="18d1" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Our monorepo is laid out with projects under a top-level frontend/ directory. To start, we wanted to add BUILD.bazel files to each of the ~1000 top-level frontend directories. 
However, doing so created cycles in the dependency graph. Bazel does not allow this because build targets must form a <a class="af os" href="https://en.wikipedia.org/wiki/Directed_acyclic_graph" rel="noopener ugc nofollow" target="_blank">DAG</a>. Breaking these cycles often felt like battling a hydra, as removing one cycle spawned more in its place. To accelerate the process, we modeled the problem as finding the <a class="af os" href="https://en.wikipedia.org/wiki/Feedback_arc_set" rel="noopener ugc nofollow" target="_blank">minimum feedback arc set (MFAS)</a> 1 to identify the minimal set of edges whose removal leaves a <a class="af os" href="https://en.wikipedia.org/wiki/Directed_acyclic_graph" rel="noopener ugc nofollow" target="_blank">DAG</a>. Removing this set caused the least disruption, required the least effort, and surfaced pathological edges.</p><h2 id="6725" class="pu ou hj bf ov pv pw dy oy px py ea pb of pz qa qb oj qc qd qe on qf qg qh qi bk">Automated BUILD.bazel Generation</h2><p id="fd5c" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">We automatically generate BUILD.bazel files for the following reasons:</p><ol class=""><li id="80c7" class="nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or qj qk ql bk">Most contents are knowable from statically analyzable import / require statements.</li><li id="915f" class="nw nx hj ny b ih qm oa ob ik qn od oe of qo oh oi oj qp ol om on qq op oq or qj qk ql bk">Automation allowed us to quickly iterate on BUILD.bazel changes as we refined our rule definitions.</li><li id="bc18" class="nw nx hj ny b ih qm oa ob ik qn od oe of qo oh oi oj qp ol om on qq op oq or qj qk ql bk">It would take time for the migration to complete and we didn’t want to ask users to keep these files up-to-date when they weren’t yet gaining value from them.</li><li id="104c" class="nw nx hj ny b ih qm oa ob ik qn od oe of qo oh oi oj qp ol om on qq op oq or qj 
qk ql bk">Manually keeping these files up-to-date would constitute an additional Bazel tax, regressing the developer experience.</li></ol><p id="8ed0" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We have a CLI tool called sync-configs that generates dependency-based configurations in the monorepo (e.g., tsconfig.json, project configuration, and now BUILD.bazel). It uses <a class="af os" href="https://github.com/jestjs/jest/tree/main/packages/jest-haste-map" rel="noopener ugc nofollow" target="_blank">jest-haste-map</a> and <a class="af os" href="https://facebook.github.io/watchman/" rel="noopener ugc nofollow" target="_blank">watchman</a> with a custom version of the <a class="af os" href="https://github.com/jestjs/jest/blob/main/packages/jest-haste-map/src/lib/dependencyExtractor.ts" rel="noopener ugc nofollow" target="_blank">dependencyExtractor</a> to determine the file-level dependency graph and part of <a class="af os" href="https://github.com/bazelbuild/bazel-gazelle" rel="noopener ugc nofollow" target="_blank">Gazelle</a> to emit BUILD.bazel files. This CLI tool is similar to <a class="af os" href="https://github.com/bazelbuild/bazel-gazelle" rel="noopener ugc nofollow" target="_blank">Gazelle</a> but also generates additional web-specific configuration files such as the tsconfig.json files used in TypeScript compilation.</p><h1 id="234e" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">CI Migration</h1><p id="5b70" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">With preparation work complete, we proceeded to migrate CI jobs to Bazel. This was a massive undertaking, so we divided the work into incremental milestones. We audited our CI jobs and chose to migrate the ones that would benefit the most: type checking, linting, and unit testing 2. 
To reduce the burden on our developers, we assigned the central Web Platform team the responsibility for porting CI jobs to Bazel. We proceeded one job at a time to deliver incremental value to developers sooner, gain confidence in our approach, focus our efforts, and build momentum. With each job, we ensured that the developer experience was high-quality, that performance improved, that CI failures were reproducible locally, and that the tooling Bazel replaced was fully deprecated and removed.</p><h1 id="3986" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Enabling TypeScript</h1><p id="a5df" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">We started with the TypeScript (TS) CI job. We first tried the open source <a class="af os" href="https://github.com/bazelbuild/rules_nodejs/blob/5.x/nodejs/private/ts_project.bzl" rel="noopener ugc nofollow" target="_blank">ts_project rule</a> 3. However, it didn’t work well with RBE due to the sheer number of inputs, so we wrote a <a class="af os" href="https://gist.github.com/brieb/8439c7869fa058554c58377fb52a3c84" rel="noopener ugc nofollow" target="_blank">custom rule</a> to reduce the number and size of the inputs.</p><p id="6ae1" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">The biggest source of inputs came from <a class="af os" href="https://a0.muscache.com/im/pictures/airbnb-platform-assets/AirbnbPlatformAssets-Bazel%20blogpost/original/6826576e-79dc-4382-bc37-a62d9be3f597.png" rel="noopener ugc nofollow" target="_blank">node_modules</a>. Prior to this, the files for each npm package were being uploaded individually. 
Since Bazel works well with Java, we packaged up a full tar and a TS-specific tar (containing only the *.ts and package.json files) for each npm package along the lines of Java JAR files (essentially zips).</p><p id="5f2f" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Another source of inputs came through transitive dependencies. Transitive node_modules and d.ts files in the sandbox were being included because technically they can be needed for subsequent project compilations. For example, suppose project foo depends on bar, and types from bar are exposed in foo’s emit. As a result, project baz, which depends on foo, would also need bar’s outputs in the sandbox. For long chains of dependencies, this can bloat the inputs significantly with files that aren’t actually needed. TypeScript has a <a class="af os" href="https://www.typescriptlang.org/tsconfig/#listFiles" rel="noopener ugc nofollow" target="_blank">--listFiles flag</a> that tells us which files are part of the compilation. We can package up this limited set of files along with the emitted d.ts files into an output tsc.tar.gz file 4. 
With this, targets need only include direct dependencies, rather than all transitive dependencies 5.</p><figure class="qs qt qu qv qw ha gs gt paragraph-image"><div role="button" tabindex="0" class="hb hc fj hd bh he"><div class="gs gt qr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*cKjuiMnl5KNyCRgG 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*cKjuiMnl5KNyCRgG 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*cKjuiMnl5KNyCRgG 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*cKjuiMnl5KNyCRgG 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*cKjuiMnl5KNyCRgG 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*cKjuiMnl5KNyCRgG 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*cKjuiMnl5KNyCRgG 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*cKjuiMnl5KNyCRgG 640w, https://miro.medium.com/v2/resize:fit:720/0*cKjuiMnl5KNyCRgG 720w, https://miro.medium.com/v2/resize:fit:750/0*cKjuiMnl5KNyCRgG 750w, https://miro.medium.com/v2/resize:fit:786/0*cKjuiMnl5KNyCRgG 786w, https://miro.medium.com/v2/resize:fit:828/0*cKjuiMnl5KNyCRgG 828w, https://miro.medium.com/v2/resize:fit:1100/0*cKjuiMnl5KNyCRgG 1100w, https://miro.medium.com/v2/resize:fit:1400/0*cKjuiMnl5KNyCRgG 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qx ff qy gs gt qz ra bf b bg z du"><em class="rb">Diagram showing how we use tars and the --listFiles flag to prune inputs/outputs of :types targets</em></figcaption></figure><p id="e8de" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">This custom rule unblocked switching to Bazel for TypeScript, as the job was now well under our CI runtime budget.</p><figure class="qs qt qu qv qw ha gs gt paragraph-image"><div role="button" tabindex="0" class="hb hc fj hd bh he"><div class="gs gt qr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*BJB0TroGRohVvjAS 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*BJB0TroGRohVvjAS 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*BJB0TroGRohVvjAS 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*BJB0TroGRohVvjAS 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*BJB0TroGRohVvjAS 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*BJB0TroGRohVvjAS 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*BJB0TroGRohVvjAS 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 
100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*BJB0TroGRohVvjAS 640w, https://miro.medium.com/v2/resize:fit:720/0*BJB0TroGRohVvjAS 720w, https://miro.medium.com/v2/resize:fit:750/0*BJB0TroGRohVvjAS 750w, https://miro.medium.com/v2/resize:fit:786/0*BJB0TroGRohVvjAS 786w, https://miro.medium.com/v2/resize:fit:828/0*BJB0TroGRohVvjAS 828w, https://miro.medium.com/v2/resize:fit:1100/0*BJB0TroGRohVvjAS 1100w, https://miro.medium.com/v2/resize:fit:1400/0*BJB0TroGRohVvjAS 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qx ff qy gs gt qz ra bf b bg z du"><em class="rb">Bar chart showing the speed up from switching to using our custom genrule</em></figcaption></figure><h1 id="74c1" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Enabling ESLint</h1><p id="56cc" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">We migrated the <a class="af os" href="https://eslint.org/" rel="noopener ugc nofollow" target="_blank">ESLint</a> job next. Bazel works best with actions that are independent and have a narrow set of inputs. 
Some of our lint rules (e.g., special internal rules, <a class="af os" href="https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/export.md" rel="noopener ugc nofollow" target="_blank">import/export</a>, <a class="af os" href="https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/extensions.md" rel="noopener ugc nofollow" target="_blank">import/extensions</a>) inspected files outside of the linted file. We restricted our lint rules to those that could operate in isolation as a way of reducing input size and having only to lint directly affected files. This meant moving or deleting lint rules (e.g., those that were made redundant with TypeScript). As a result, we reduced CI times by over 70%.</p><figure class="qs qt qu qv qw ha gs gt paragraph-image"><div role="button" tabindex="0" class="hb hc fj hd bh he"><div class="gs gt qr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*jxLe6RvjzyINaahq 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*jxLe6RvjzyINaahq 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*jxLe6RvjzyINaahq 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*jxLe6RvjzyINaahq 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*jxLe6RvjzyINaahq 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*jxLe6RvjzyINaahq 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*jxLe6RvjzyINaahq 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/0*jxLe6RvjzyINaahq 640w, https://miro.medium.com/v2/resize:fit:720/0*jxLe6RvjzyINaahq 720w, https://miro.medium.com/v2/resize:fit:750/0*jxLe6RvjzyINaahq 750w, https://miro.medium.com/v2/resize:fit:786/0*jxLe6RvjzyINaahq 786w, https://miro.medium.com/v2/resize:fit:828/0*jxLe6RvjzyINaahq 828w, https://miro.medium.com/v2/resize:fit:1100/0*jxLe6RvjzyINaahq 1100w, https://miro.medium.com/v2/resize:fit:1400/0*jxLe6RvjzyINaahq 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qx ff qy gs gt qz ra bf b bg z du"><em class="rb">Time series graph showing the runtime speed-up in early May from only running ESLint on directly affected targets</em></figcaption></figure><h1 id="c4df" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Enabling Jest</h1><p id="5a30" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Our next challenge was enabling <a class="af os" href="https://jestjs.io" rel="noopener ugc nofollow" target="_blank">Jest</a>. 
This presented unique challenges, as we needed to bring along a much larger set of first and third-party dependencies, and there were more Bazel-specific failures to fix.</p><h2 id="59b6" class="pu ou hj bf ov pv pw dy oy px py ea pb of pz qa qb oj qc qd qe on qf qg qh qi bk">Worker and Docker Cache</h2><p id="6263" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">We tarred up dependencies to reduce input size, but extraction was still slow. To address this, we introduced caching. One layer of cache is on the remote worker and another is on the worker’s Docker container, baked into the image at build time. The Docker layer exists to avoid losing our cache when remote workers are auto-scaled. We run a cron job once a week to update the Docker image with the newest set of cached dependencies, striking a balance of keeping them fresh while avoiding image thrashing. For more details, check out <a class="af os" href="https://blog.engflow.com/2023/06/01/bazel-community-day--san-francisco/#taming-node_modules-in-rbe-airbnbs-journey-sharmila-jesupaul-airbnb" rel="noopener ugc nofollow" target="_blank">this Bazel Community Day talk</a>.</p><figure class="qs qt qu qv qw ha gs gt paragraph-image"><div role="button" tabindex="0" class="hb hc fj hd bh he"><div class="gs gt qr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*e7zywi2UKNMa9qTF 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*e7zywi2UKNMa9qTF 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*e7zywi2UKNMa9qTF 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*e7zywi2UKNMa9qTF 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*e7zywi2UKNMa9qTF 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*e7zywi2UKNMa9qTF 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*e7zywi2UKNMa9qTF 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*e7zywi2UKNMa9qTF 640w, https://miro.medium.com/v2/resize:fit:720/0*e7zywi2UKNMa9qTF 720w, https://miro.medium.com/v2/resize:fit:750/0*e7zywi2UKNMa9qTF 750w, https://miro.medium.com/v2/resize:fit:786/0*e7zywi2UKNMa9qTF 786w, https://miro.medium.com/v2/resize:fit:828/0*e7zywi2UKNMa9qTF 828w, https://miro.medium.com/v2/resize:fit:1100/0*e7zywi2UKNMa9qTF 1100w, https://miro.medium.com/v2/resize:fit:1400/0*e7zywi2UKNMa9qTF 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qx ff qy gs gt qz ra bf b bg z du"><em class="rb">Diagram showing symlinked npm dependencies to a Docker cache and worker cache</em></figcaption></figure><p id="dd3c" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">This added caching provided us with a ~25% speed up of our Jest unit testing CI job overall and reduced the time to extract our dependencies from 1–3 minutes to 3–7 seconds per target. 
This implementation required us to enable the Node.js <a class="af os" href="https://nodejs.org/api/cli.html#--preserve-symlinks" rel="noopener ugc nofollow" target="_blank">--preserve-symlinks</a> option and patch some of our tools that followed symlinks to their real paths. We extended this caching strategy to our <a class="af os" href="https://github.com/jestjs/jest/tree/main/packages/babel-jest" rel="noopener ugc nofollow" target="_blank">Babel</a> transformation cache, another source of poor performance.</p><h2 id="d47d" class="pu ou hj bf ov pv pw dy oy px py ea pb of pz qa qb oj qc qd qe on qf qg qh qi bk">Implicit Dependencies</h2><p id="d5ff" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Next, we needed to fix Bazel-specific test failures. Most of these were due to missing files. For any inputs not statically analyzable (e.g., referenced as a string without an import, Babel plugin string referenced in .babelrc), we added support for a Bazel keep comment (e.g., // bazelKeep: path/to/file), which acts as though the file were imported. The advantages of this approach are:</p><p id="e5ba" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">1. It is colocated with the code that uses the dependency,</p><p id="55e0" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">2. BUILD.bazel files don’t need to be manually edited to add/move <a class="af os" href="https://github.com/bazelbuild/bazel-gazelle?tab=readme-ov-file#keep-comments" rel="noopener ugc nofollow" target="_blank"># keep comments</a>,</p><p id="f6e2" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">3. 
There is no effect on runtime.</p><p id="028e" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">A small number of tests were unsuitable for Bazel because they required a large view of the repository or a dynamic and implicit set of dependencies. We moved these tests out of our unit testing job to separate CI checks.</p><h2 id="3737" class="pu ou hj bf ov pv pw dy oy px py ea pb of pz qa qb oj qc qd qe on qf qg qh qi bk">Preventing Backsliding</h2><p id="87da" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">With over 20,000 test files and hundreds of people actively working in the same repository, we needed to pursue test fixes such that they would not be undone as product development progressed.</p><p id="9301" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Our CI has three types of build queues:</p><p id="128e" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">1. “Required”, which blocks changes,</p><p id="9ed5" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">2. “Optional”, which is non-blocking,</p><p id="24b8" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">3. “Hidden”, which is non-blocking and not shown on PRs.</p><p id="d527" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">As we fixed tests, we moved them from “hidden” to “required” via a rule attribute. 
To ensure a single source of truth, tests run in “required” under Bazel were not run under the Jest setup being replaced.</p><pre class="qs qt qu qv qw rc rd re bp rf bb bk"># frontend/app/script/__tests__/BUILD.bazel<br />jest_test(<br />    name = "jest_test",<br />    is_required = True, # makes this target a required check on pull requests <br />    deps = [<br />        ":source_library",<br />    ],<br />)</pre><p id="9878" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><em class="rl">Example jest_test rule. This signifies that this target will run on the “required” build queue.</em></p><p id="347e" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We wrote a script comparing before and after Bazel to determine migration-readiness, using the metrics of test runtime, code coverage stats, and failure rate. Fortunately, the bulk of tests could be enabled without additional changes, so we enabled these in batches. We divided and conquered the remaining burndown list of failures with the central team, Web Platform, fixing and updating tests in Bazel to avoid putting this burden on our developers. After a grace period, we fully disabled and deleted the non-Bazel Jest infrastructure and removed the is_required param.</p><h1 id="8272" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Local Bazel Experience</h1><p id="1574" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">In tandem with our CI migration, we ensured that developers can run Bazel locally to reproduce and iterate on CI failures. Our migration principles included delivering only what was on par with or superior to the existing developer experience and performance. 
JavaScript tools have developer-friendly CLI experiences (e.g., watch mode, targeting select files, rich interactivity) and IDE integrations that we wanted to retain. By default, frontend developers can continue using the tools they know and love, and in cases where it is beneficial they can opt into Bazel. Discrepancies between Bazel and non-Bazel are rare, and when they do occur, developers have a means of resolving the issue. For example, developers can run a single script, failed-on-pr, which re-runs any targets failing in CI locally to easily reproduce issues.</p><figure class="qs qt qu qv qw ha gs gt paragraph-image"><div role="button" tabindex="0" class="hb hc fj hd bh he"><div class="gs gt qr"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*IqcxStamyg_zPexr 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*IqcxStamyg_zPexr 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*IqcxStamyg_zPexr 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*IqcxStamyg_zPexr 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*IqcxStamyg_zPexr 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*IqcxStamyg_zPexr 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*IqcxStamyg_zPexr 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*IqcxStamyg_zPexr 640w, https://miro.medium.com/v2/resize:fit:720/0*IqcxStamyg_zPexr 720w, 
https://miro.medium.com/v2/resize:fit:750/0*IqcxStamyg_zPexr 750w, https://miro.medium.com/v2/resize:fit:786/0*IqcxStamyg_zPexr 786w, https://miro.medium.com/v2/resize:fit:828/0*IqcxStamyg_zPexr 828w, https://miro.medium.com/v2/resize:fit:1100/0*IqcxStamyg_zPexr 1100w, https://miro.medium.com/v2/resize:fit:1400/0*IqcxStamyg_zPexr 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qx ff qy gs gt qz ra bf b bg z du"><em class="rb">Annotations on a failing build with scripts to recreate the failures, e.g. yak script jest:failed-on-pr</em></figcaption></figure><p id="3f8f" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We also normalize platform-specific binaries so that we can reuse the cache between Linux and macOS builds. This speeds up local development and CI jobs by sharing the cache between a developer’s local MacBook and the Linux machines in CI. For native npm packages (<a class="af os" href="https://github.com/nodejs/node-gyp" rel="noopener ugc nofollow" target="_blank">node-gyp</a> dependencies), we exclude platform-specific files and build the package on the execution machine. The execution machine is the machine that runs the test or build process. 
We also use “universal binaries” (e.g., for node and zstd), where all platform binaries are included as inputs (so that inputs are consistent no matter which platform the action is run from) and the proper binary is <a class="af os" href="https://gist.github.com/brieb/3c0fdb614122e928b4546c5d85c97ab3" rel="noopener ugc nofollow" target="_blank">chosen at runtime</a>.</p><h1 id="1f29" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Conclusion</h1><p id="b136" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Adopting Bazel for our core CI jobs yielded significant performance improvements for TypeScript type checking (34% faster), ESLint linting (35% faster), and Jest unit tests (42% faster incremental runs, 29% overall). Moreover, our CI can now better scale as the repo grows.</p><p id="55c4" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Next, to further improve Bazel performance, we will be focusing on persisting a warm Bazel host across CI runs, taming our build graph, powering CI jobs that do not use Bazel with the Bazel build graph, and potentially exploring <a class="af os" href="https://en.wikipedia.org/wiki/SquashFS" rel="noopener ugc nofollow" target="_blank">SquashFS</a> to further compress and optimize our Bazel sandboxes.</p><p id="63a9" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We hope that sharing our journey has provided insights for organizations considering a Bazel migration for web.</p><h1 id="9171" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Acknowledgments</h1><p id="453a" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Thank you Madison Capps, Meghan Dow, Matt Insler, Janusz Kudelka, Joe Lencioni, 
Rae Liu, James Robinson, Joel Snyder, Elliott Sprehn, Fanying Ye, and various other internal and external partners who helped bring Bazel to Airbnb.</p><p id="980b" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We are also grateful to the broader Bazel community for being welcoming and sharing ideas.</p><h1 id="65fa" class="ot ou hj bf ov ow ox ij oy oz pa im pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">****************</h1><p id="c4da" class="pw-post-body-paragraph nw nx hj ny b ih pp oa ob ik pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">[1]: This problem is <a class="af os" href="https://en.wikipedia.org/wiki/NP-completeness" rel="noopener ugc nofollow" target="_blank">NP-complete</a>, though approximation algorithms have been devised that still guarantee no cycles; we chose the <a class="af os" href="https://github.com/zhenv5/breaking_cycles_in_noisy_hierarchies" rel="noopener ugc nofollow" target="_blank">implementation</a> outlined in “<a class="af os" href="https://dl.acm.org/doi/pdf/10.1145/3091478.3091495" rel="noopener ugc nofollow" target="_blank">Breaking Cycles in Noisy Hierarchies</a>”.</p><p id="2d1c" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">[2]: After initial evaluation, we considered migrating web asset bundling as out of scope (though we may revisit this in the future) due to high level of effort, unknowns in the bundler landscape, and neutral return on investment given our recent adoption of <a class="af os" rel="noopener" href="https://medium.com/airbnb-engineering/faster-javascript-builds-with-metro-cfc46d617a1f">Metro</a>, as Metro’s architecture already factors in scalability features (e.g. 
parallelism, local and remote caching, and incremental builds).</p><p id="b864" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">[3]: There are newer TS rules that may work well for you <a class="af os" href="https://github.com/aspect-build/rules_ts" rel="noopener ugc nofollow" target="_blank">here</a>.</p><p id="3098" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">[4]: We later switched to using <a class="af os" href="https://github.com/facebook/zstd" rel="noopener ugc nofollow" target="_blank">zstd</a> instead of gzip because it produces archives that are better compressed and more deterministic, keeping tarballs consistent across different platforms.</p><p id="79f9" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">[5]: While unnecessary files may still be included, it’s a much narrower set (and could be pruned as a further optimization).</p><p id="a7cd" class="pw-post-body-paragraph nw nx hj ny b ih nz oa ob ik oc od oe of og oh oi oj ok ol om on oo op oq or gn bk"><em class="rl">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/adopting-bazel-for-web-at-scale-a784b2dbe325</link>
      <guid>https://medium.com/airbnb-engineering/adopting-bazel-for-web-at-scale-a784b2dbe325</guid>
      <pubDate>Tue, 12 Nov 2024 19:22:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning]]></title>
      <description><![CDATA[<div><div></div><figure class="mz na nb nc nd ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ntAY9EP682xs6adB 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ntAY9EP682xs6adB 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ntAY9EP682xs6adB 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*ntAY9EP682xs6adB 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ntAY9EP682xs6adB 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ntAY9EP682xs6adB 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ntAY9EP682xs6adB 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*ntAY9EP682xs6adB 640w, https://miro.medium.com/v2/resize:fit:720/0*ntAY9EP682xs6adB 720w, https://miro.medium.com/v2/resize:fit:750/0*ntAY9EP682xs6adB 750w, https://miro.medium.com/v2/resize:fit:786/0*ntAY9EP682xs6adB 786w, https://miro.medium.com/v2/resize:fit:828/0*ntAY9EP682xs6adB 828w, https://miro.medium.com/v2/resize:fit:1100/0*ntAY9EP682xs6adB 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ntAY9EP682xs6adB 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and 
(max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="478b" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">How Airbnb leverages machine learning and reinforcement learning techniques to solve a unique information retrieval task and provide guests with unique, affordable, and differentiated accommodations around the world.</p><p id="2252" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><strong class="nm gv">By:</strong> <a class="af oi" href="https://www.linkedin.com/in/dillon-davis/" rel="noopener ugc nofollow" target="_blank">Dillon Davis</a>, <a class="af oi" href="https://www.linkedin.com/in/huiji-gao/" rel="noopener ugc nofollow" target="_blank">Huiji Gao</a>, <a class="af oi" href="https://www.linkedin.com/in/thomaslegrand1/" rel="noopener ugc nofollow" target="_blank">Thomas Legrand</a>, <a class="af oi" href="https://www.linkedin.com/in/weiwei-guo/" rel="noopener ugc nofollow" target="_blank">Weiwei Guo</a>, <a class="af oi" href="https://www.linkedin.com/in/malayhaldar/" rel="noopener ugc nofollow" target="_blank">Malay Haldar</a>, <a class="af oi" href="https://www.linkedin.com/in/alex-shaojie-deng-b572347/" rel="noopener ugc nofollow" target="_blank">Alex Deng</a>, <a class="af oi" href="https://www.linkedin.com/in/han-zhao-692944116/" rel="noopener ugc nofollow" target="_blank">Han Zhao</a>, <a class="af oi" href="https://www.linkedin.com/in/liweihe/" rel="noopener ugc nofollow" target="_blank">Liwei He</a>, <a class="af oi" href="https://www.linkedin.com/in/sanjeevkatariya/" rel="noopener ugc nofollow" target="_blank">Sanjeev Katariya</a></p><h1 id="70a9" 
class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Introduction</h1><p id="2509" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">Airbnb has transformed the way people travel around the globe. As Airbnb’s inventory spans diverse locations and property types, providing guests with relevant options in their search results has become increasingly complex. In this blog post, we’ll discuss shifting from using simple heuristics to advanced machine learning and reinforcement learning techniques to transform what we call location retrieval in order to address this challenge.</em></p><h1 id="ed27" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">The Challenge of Location Retrieval</h1><p id="a9ea" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">Guests typically start searching by entering a destination in the search bar and expect the most relevant results to be surfaced. These destinations can be countries, states, cities, neighborhoods, streets, addresses, or points of interest. Unlike traditional travel accommodations, Airbnb listings are spread across different neighborhoods and surrounding areas. For example, a family searching for a vacation rental in San Francisco might find better options in nearby cities like Daly City, where there are larger single-family homes. Thus, the system needs to account for not just the searched location but also nearby areas that might offer better options for the guest. 
This is evidenced by the locations of booked listings when searching for San Francisco shown below.</em></p><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*hxCpOroNnkEPLz9X 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*hxCpOroNnkEPLz9X 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*hxCpOroNnkEPLz9X 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*hxCpOroNnkEPLz9X 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*hxCpOroNnkEPLz9X 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*hxCpOroNnkEPLz9X 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*hxCpOroNnkEPLz9X 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*hxCpOroNnkEPLz9X 640w, https://miro.medium.com/v2/resize:fit:720/0*hxCpOroNnkEPLz9X 720w, https://miro.medium.com/v2/resize:fit:750/0*hxCpOroNnkEPLz9X 750w, https://miro.medium.com/v2/resize:fit:786/0*hxCpOroNnkEPLz9X 786w, https://miro.medium.com/v2/resize:fit:828/0*hxCpOroNnkEPLz9X 828w, https://miro.medium.com/v2/resize:fit:1100/0*hxCpOroNnkEPLz9X 1100w, https://miro.medium.com/v2/resize:fit:1400/0*hxCpOroNnkEPLz9X 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="31fd" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">Given Airbnb’s scale, we </em><strong class="nm gv"><em class="pm">cannot rank every listing for every search</em></strong><em class="pm">. This presented a challenge to create a system that dynamically infers a relevant map area for a query. This system, known as location retrieval, needed to balance including a wide variety of listings to appeal to all guests’ needs while still being relevant to the query. Our search ranking models can then efficiently rank the subset of our inventory that is within the relevant map area and surface the </em><strong class="nm gv"><em class="pm">most relevant </em></strong><em class="pm">inventory to our guests. 
This system and more are outlined below.</em></p><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*yAquzHujJys9Zh5d 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*yAquzHujJys9Zh5d 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*yAquzHujJys9Zh5d 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*yAquzHujJys9Zh5d 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*yAquzHujJys9Zh5d 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*yAquzHujJys9Zh5d 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*yAquzHujJys9Zh5d 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*yAquzHujJys9Zh5d 640w, https://miro.medium.com/v2/resize:fit:720/0*yAquzHujJys9Zh5d 720w, https://miro.medium.com/v2/resize:fit:750/0*yAquzHujJys9Zh5d 750w, https://miro.medium.com/v2/resize:fit:786/0*yAquzHujJys9Zh5d 786w, https://miro.medium.com/v2/resize:fit:828/0*yAquzHujJys9Zh5d 828w, https://miro.medium.com/v2/resize:fit:1100/0*yAquzHujJys9Zh5d 1100w, https://miro.medium.com/v2/resize:fit:1400/0*yAquzHujJys9Zh5d 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) 
and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="2f8c" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Starting with Heuristics: The Cold Start Problem</h1><p id="54e5" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">Initially, Airbnb relied on heuristics to define map areas based on the type of search. For example, if a guest searched for a country, the system would use administrative boundaries to filter listings within that country. If they searched for a city, the system would create a 25-mile radius around the city center to retrieve listings.</em></p><p id="7c35" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">Improving these heuristics proved to be profoundly impactful. One example is the introduction of a log-scale, parameterized smooth function that computes an expansion factor for the diagonal size of the administrative bounds of the searched destination. We applied this to very precise locations such as addresses, buildings, and points of interest (POIs), resulting in a 0.35% increase in uncancelled bookers on the platform when tested in an online A/B experiment against the baseline heuristics. 
Figures below demonstrate how search results for a building in Ibiza, Spain improved dramatically with this heuristic by surfacing significantly more and higher quality inventory.</em></p><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*UaxuoXvZzqydrQCq 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*UaxuoXvZzqydrQCq 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*UaxuoXvZzqydrQCq 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*UaxuoXvZzqydrQCq 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*UaxuoXvZzqydrQCq 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*UaxuoXvZzqydrQCq 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*UaxuoXvZzqydrQCq 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*UaxuoXvZzqydrQCq 640w, https://miro.medium.com/v2/resize:fit:720/0*UaxuoXvZzqydrQCq 720w, https://miro.medium.com/v2/resize:fit:750/0*UaxuoXvZzqydrQCq 750w, https://miro.medium.com/v2/resize:fit:786/0*UaxuoXvZzqydrQCq 786w, https://miro.medium.com/v2/resize:fit:828/0*UaxuoXvZzqydrQCq 828w, https://miro.medium.com/v2/resize:fit:1100/0*UaxuoXvZzqydrQCq 1100w, https://miro.medium.com/v2/resize:fit:1400/0*UaxuoXvZzqydrQCq 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*pIfZ6a4zoqW1PjYp 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*pIfZ6a4zoqW1PjYp 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*pIfZ6a4zoqW1PjYp 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*pIfZ6a4zoqW1PjYp 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*pIfZ6a4zoqW1PjYp 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*pIfZ6a4zoqW1PjYp 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*pIfZ6a4zoqW1PjYp 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*pIfZ6a4zoqW1PjYp 640w, https://miro.medium.com/v2/resize:fit:720/0*pIfZ6a4zoqW1PjYp 720w, https://miro.medium.com/v2/resize:fit:750/0*pIfZ6a4zoqW1PjYp 750w, https://miro.medium.com/v2/resize:fit:786/0*pIfZ6a4zoqW1PjYp 786w, 
https://miro.medium.com/v2/resize:fit:828/0*pIfZ6a4zoqW1PjYp 828w, https://miro.medium.com/v2/resize:fit:1100/0*pIfZ6a4zoqW1PjYp 1100w, https://miro.medium.com/v2/resize:fit:1400/0*pIfZ6a4zoqW1PjYp 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="f745" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">These heuristics were simple and worked well enough to start, but they had limitations. They couldn’t distinguish between types of searches (e.g., a family looking for a large home versus a solo traveler looking for a small apartment), and they didn’t adapt well to new data as Airbnb’s inventory and guest preferences evolved.</em></p><h1 id="762d" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Exploring Statistics to Help Improve Location Retrieval</h1><p id="f27e" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">With more data available over time from these intuition-based heuristics, we thought there might be a way to take advantage of this historical user booking behavior to improve location retrieval. We built a dataset for each travel destination that recorded where guests booked listings when searching for that destination. 
Based on this data, the system could create retrieval map areas that included 96% of the nearest booked listings for a given destination.</em></p><p id="f3c4" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">We tested these newly constructed retrieval map areas in place of the intuition-based heuristics outlined above, hypothesizing that they would give guests a more bookable selection of inventory. While this statistical approach was more aligned with guest booking behavior, it still had limitations. It treated all searches for a location the same, regardless of specific search parameters like group size or travel dates. This uniform approach meant that some guests might not see the best listings for their particular needs. As a result, this statistics-based method produced no detectable increase in uncancelled bookers on the platform when tested against the heuristics outlined above in an online A/B experiment. This led us to believe that location retrieval may require more advanced techniques such as machine learning.</em></p><h1 id="6905" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Advancing to Machine Learning</h1><p id="3099" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">Instead of only relying on past booking data, the new system could learn from various search parameters, such as the number of guests and stay duration. 
By analyzing this data, a model could predict more relevant map areas for each search, rather than applying a one-size-fits-all approach.</em></p><p id="852c" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">For example, a group of ten travelers searching for a San Francisco vacation rental might prefer larger homes in the suburbs, while solo travelers might prioritize central locations. The machine learning model could distinguish between these different preferences and adjust the retrieval map areas accordingly, providing more tailored results.</em></p><p id="a689" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">We constructed our machine learning model as follows. The design is the result of three iterations that introduced the machine learning model, expanded its feature set, and expanded search attribution. The architecture is depicted in the figure below.</em></p><ol class=""><li id="ac3b" class="nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh ps pt pu bk"><em class="pm">Training Examples: Searches that a booker issued, either by entering a destination in the search bar or by manipulating the map, and whose results contained the booked listing on the same day as the booking or one day before it. We discard any bookings that are canceled 7 days after booking.</em></li><li id="9ce7" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Training Features: We derive features directly from the search request such as location name, stay length, number of guests, price filters, location country, etc. 
There are 9 continuous features and 19 categorical features in total.</em></li><li id="0824" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Training Labels: The latitude and longitude coordinates of the booked listing attributed to the search.</em></li><li id="53c4" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Architecture: A two-layer neural network of size 256 was chosen in order to have more flexibility for loss formulation compared to traditional regression and decision-tree-based approaches.</em></li><li id="aa21" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Model Output: Four floats that define the latitude and longitude offsets from the center latitude and longitude coordinates of the searched destination and that together represent the relevant map area.</em></li><li id="cc3b" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Loss: Trained to predict map areas that contain their associated booked listing while minimizing the size of the predicted map area and the occurrence of predictions that cannot construct a valid rectangular map area.</em></li></ol><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*WBuiuf_DdU96xy7j 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*WBuiuf_DdU96xy7j 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*WBuiuf_DdU96xy7j 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*WBuiuf_DdU96xy7j 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*WBuiuf_DdU96xy7j 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*WBuiuf_DdU96xy7j 1100w, 
https://miro.medium.com/v2/resize:fit:1400/format:webp/0*WBuiuf_DdU96xy7j 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*WBuiuf_DdU96xy7j 640w, https://miro.medium.com/v2/resize:fit:720/0*WBuiuf_DdU96xy7j 720w, https://miro.medium.com/v2/resize:fit:750/0*WBuiuf_DdU96xy7j 750w, https://miro.medium.com/v2/resize:fit:786/0*WBuiuf_DdU96xy7j 786w, https://miro.medium.com/v2/resize:fit:828/0*WBuiuf_DdU96xy7j 828w, https://miro.medium.com/v2/resize:fit:1100/0*WBuiuf_DdU96xy7j 1100w, https://miro.medium.com/v2/resize:fit:1400/0*WBuiuf_DdU96xy7j 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="4685" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">The machine learning system increased the recall of booked listings (i.e., how often the system retrieved a listing that was eventually booked) by 7.12% and reduced the size of the retrieval map area by 40.83%. 
It had a cumulative impact of +1.8% in uncanceled bookers on the platform. The initial model was evaluated against the baseline, and each subsequent model iteration was evaluated against the model it replaced.</em></p><p id="e941" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">The figures below show how search results for a specific street in Lima, Peru improved with the model, which surfaces results that are much closer to the searched street.</em></p><p id="23ab" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">Before</em></p><figure class="pn po pp pq pr ne mw mx paragraph-image"><div class="mw mx qa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*WY6zPazfA0a_i0hA 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*WY6zPazfA0a_i0hA 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*WY6zPazfA0a_i0hA 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*WY6zPazfA0a_i0hA 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*WY6zPazfA0a_i0hA 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*WY6zPazfA0a_i0hA 1100w, https://miro.medium.com/v2/resize:fit:1024/format:webp/0*WY6zPazfA0a_i0hA 1024w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 512px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*WY6zPazfA0a_i0hA 640w, 
https://miro.medium.com/v2/resize:fit:720/0*WY6zPazfA0a_i0hA 720w, https://miro.medium.com/v2/resize:fit:750/0*WY6zPazfA0a_i0hA 750w, https://miro.medium.com/v2/resize:fit:786/0*WY6zPazfA0a_i0hA 786w, https://miro.medium.com/v2/resize:fit:828/0*WY6zPazfA0a_i0hA 828w, https://miro.medium.com/v2/resize:fit:1100/0*WY6zPazfA0a_i0hA 1100w, https://miro.medium.com/v2/resize:fit:1024/0*WY6zPazfA0a_i0hA 1024w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 512px" /></picture></div></figure><p id="227f" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">After</em></p><figure class="pn po pp pq pr ne mw mx paragraph-image"><div class="mw mx qa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*HRapNwJYpMmS3pUr 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*HRapNwJYpMmS3pUr 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*HRapNwJYpMmS3pUr 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*HRapNwJYpMmS3pUr 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*HRapNwJYpMmS3pUr 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*HRapNwJYpMmS3pUr 1100w, https://miro.medium.com/v2/resize:fit:1024/format:webp/0*HRapNwJYpMmS3pUr 1024w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, 
(min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 512px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*HRapNwJYpMmS3pUr 640w, https://miro.medium.com/v2/resize:fit:720/0*HRapNwJYpMmS3pUr 720w, https://miro.medium.com/v2/resize:fit:750/0*HRapNwJYpMmS3pUr 750w, https://miro.medium.com/v2/resize:fit:786/0*HRapNwJYpMmS3pUr 786w, https://miro.medium.com/v2/resize:fit:828/0*HRapNwJYpMmS3pUr 828w, https://miro.medium.com/v2/resize:fit:1100/0*HRapNwJYpMmS3pUr 1100w, https://miro.medium.com/v2/resize:fit:1024/0*HRapNwJYpMmS3pUr 1024w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 512px" /></picture></div></figure><h1 id="32f5" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Exploring New Frontiers with Reinforcement Learning</h1><p id="20a8" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">While machine learning improved the system’s ability to differentiate search results, there was still room for improvement, particularly in learning whether locations that had never been surfaced before were relevant to guests for a search. 
To address this, Airbnb introduced reinforcement learning to the location retrieval process.</em></p><p id="6451" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">Reinforcement learning allowed the system to continuously learn from guest interactions by surfacing new areas for a given destination and adjusting the retrieval map area based on guest booking behavior. This setting, framed as a contextual multi-armed bandit problem, involved balancing exploration (surfacing new locations) with exploitation (surfacing previously successful locations). The system could actively experiment with different retrieval map areas, learning from guest bookings to refine its predictions.</em></p><p id="a30a" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">Applying a contextual multi-armed bandit traditionally requires defining an active contextual estimator, a method for uncertainty estimation, and an exploration strategy. We took the following approach given product constraints, system constraints, and the nature of our model formulation. 
The architecture is depicted in the figure below.</em></p><ol class=""><li id="a93c" class="nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh ps pt pu bk"><em class="pm">Active contextual estimation: We employed our existing machine learning model for location retrieval retrained on a daily basis to regularly learn from any new bookings data that we collect while surfacing previously unshown locations.</em></li><li id="2f1f" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Uncertainty estimation: We modified our model architecture with a random dropout layer to generate 32 unique predictions for a given search </em><a class="af oi" href="https://arxiv.org/abs/1506.02142" rel="noopener ugc nofollow" target="_blank"><em class="pm">(Monte Carlo Dropout</em></a><em class="pm">). This allows us to measure the mean and standard deviation of our prediction while minimizing negative impact to system performance and changes to our existing model formulation.</em></li><li id="6592" class="nk nl gu nm b nn pv np nq nr pw nt nu nv px nx ny nz py ob oc od pz of og oh ps pt pu bk"><em class="pm">Exploration Strategy: We compute an </em><a class="af oi" href="https://www.sciencedirect.com/science/article/pii/0196885885900028/pdf?md5=5e944497404774c469271b5074a677a8&amp;pid=1-s2.0-0196885885900028-main.pdf" rel="noopener ugc nofollow" target="_blank"><em class="pm">upper confidence bound</em></a><em class="pm"> using the mean and standard deviation of our prediction in order to construct larger retrieval map areas based on the model’s confidence in its prediction for the search.</em></li></ol><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*IdMA6OR9Tl2TOQRA 640w, 
https://miro.medium.com/v2/resize:fit:720/format:webp/0*IdMA6OR9Tl2TOQRA 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*IdMA6OR9Tl2TOQRA 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*IdMA6OR9Tl2TOQRA 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*IdMA6OR9Tl2TOQRA 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*IdMA6OR9Tl2TOQRA 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*IdMA6OR9Tl2TOQRA 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*IdMA6OR9Tl2TOQRA 640w, https://miro.medium.com/v2/resize:fit:720/0*IdMA6OR9Tl2TOQRA 720w, https://miro.medium.com/v2/resize:fit:750/0*IdMA6OR9Tl2TOQRA 750w, https://miro.medium.com/v2/resize:fit:786/0*IdMA6OR9Tl2TOQRA 786w, https://miro.medium.com/v2/resize:fit:828/0*IdMA6OR9Tl2TOQRA 828w, https://miro.medium.com/v2/resize:fit:1100/0*IdMA6OR9Tl2TOQRA 1100w, https://miro.medium.com/v2/resize:fit:1400/0*IdMA6OR9Tl2TOQRA 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" 
/></picture></div></div></figure><p id="38e0" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">This system successfully explored more for less-traveled locations where it was less confident and explored less for locations that are often searched and booked. For example, pictured below are the mean (inner) and upper confidence bound (outer) estimates of retrieval map areas for San Francisco, CA (left) and Smith Mountain Lake, Virginia (right). San Francisco is searched almost 25x more than Smith Mountain Lake with proportionately more bookings as well. As a result, the model is more confident in its retrieval map area estimate for San Francisco vs Smith Mountain Lake resulting in 2–3x less exploration for San Francisco queries vs Smith Mountain Lake.</em></p><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx qb"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*MVav43Mv8hV0QdCP 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*MVav43Mv8hV0QdCP 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*MVav43Mv8hV0QdCP 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*MVav43Mv8hV0QdCP 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*MVav43Mv8hV0QdCP 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*MVav43Mv8hV0QdCP 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*MVav43Mv8hV0QdCP 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*MVav43Mv8hV0QdCP 640w, https://miro.medium.com/v2/resize:fit:720/0*MVav43Mv8hV0QdCP 720w, https://miro.medium.com/v2/resize:fit:750/0*MVav43Mv8hV0QdCP 750w, https://miro.medium.com/v2/resize:fit:786/0*MVav43Mv8hV0QdCP 786w, https://miro.medium.com/v2/resize:fit:828/0*MVav43Mv8hV0QdCP 828w, https://miro.medium.com/v2/resize:fit:1100/0*MVav43Mv8hV0QdCP 1100w, https://miro.medium.com/v2/resize:fit:1400/0*MVav43Mv8hV0QdCP 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><figure class="pn po pp pq pr ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx qc"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Yhqm1eXrI0iV1LQG 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Yhqm1eXrI0iV1LQG 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Yhqm1eXrI0iV1LQG 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Yhqm1eXrI0iV1LQG 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Yhqm1eXrI0iV1LQG 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Yhqm1eXrI0iV1LQG 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Yhqm1eXrI0iV1LQG 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Yhqm1eXrI0iV1LQG 640w, https://miro.medium.com/v2/resize:fit:720/0*Yhqm1eXrI0iV1LQG 720w, https://miro.medium.com/v2/resize:fit:750/0*Yhqm1eXrI0iV1LQG 750w, https://miro.medium.com/v2/resize:fit:786/0*Yhqm1eXrI0iV1LQG 786w, https://miro.medium.com/v2/resize:fit:828/0*Yhqm1eXrI0iV1LQG 828w, https://miro.medium.com/v2/resize:fit:1100/0*Yhqm1eXrI0iV1LQG 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Yhqm1eXrI0iV1LQG 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="e5bd" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">The reinforcement learning system was also tested against the outgoing machine learning model in online A/B experiments showing a cumulative 0.51% increase in uncanceled bookers and 0.71% increase in 5 star trip rate over two iterations that introduced reinforcement learning and optimized scoring of the more complex model.</em></p><h1 id="3fed" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">Conclusion: A Transformative 
Journey</h1><p id="206e" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">Airbnb’s journey from simple heuristics to sophisticated machine learning and reinforcement learning models demonstrates the power of data-driven approaches in transforming complex systems. By continually iterating and improving its location retrieval process, Airbnb has not only enhanced the relevance of its search results but also helped guests experience more 5-star trips.</em></p><p id="6751" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">This transformation resulted in a cumulative 2.66% increase in uncanceled bookers, a major achievement for a company operating at Airbnb’s scale. More details can be found in </em><a class="af oi" href="https://arxiv.org/abs/2408.13399" rel="noopener ugc nofollow" target="_blank"><em class="pm">our technical paper</em></a><em class="pm">. As Airbnb continues to innovate, we are continuously evaluating and introducing more advanced features and retrieval mechanisms, such as retrieval with complex polygons. 
These will further refine and enhance the search experience for millions of guests worldwide.</em></p><p id="e269" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="pm">If this type of work interests you, check out some of our related positions and more at </em><a class="af oi" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="pm">Careers at Airbnb</em></a><em class="pm">!</em></p><h1 id="935a" class="oj ok gu bf ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg bk">****************</h1><p id="f64b" class="pw-post-body-paragraph nk nl gu nm b nn ph np nq nr pi nt nu nv pj nx ny nz pk ob oc od pl of og oh gn bk"><em class="pm">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/transforming-location-retrieval-at-airbnb-a-journey-from-heuristics-to-reinforcement-learning-d33ffc4ddb8f</link>
      <guid>https://medium.com/airbnb-engineering/transforming-location-retrieval-at-airbnb-a-journey-from-heuristics-to-reinforcement-learning-d33ffc4ddb8f</guid>
      <pubDate>Mon, 11 Nov 2024 19:14:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Automation Platform v2: Improving Conversational AI at Airbnb]]></title>
      <description><![CDATA[<div><div><h2 id="3e82" class="pw-subtitle-paragraph hr gt gu bf b hs ht hu hv hw hx hy hz ia ib ic id ie if ig cq du"><strong class="al">How Airbnb’s conversational AI platform powers LLM application development.</strong></h2><div></div><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj nk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 828w, 
https://miro.medium.com/v2/resize:fit:1100/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*36lUKfHUjs_YMo8DMj0huQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="34d4" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">By <a class="af os" href="https://www.linkedin.com/in/chutianwang/" rel="noopener ugc nofollow" target="_blank">Chutian Wang</a>, <a class="af os" href="https://www.linkedin.com/in/zhiheng-xu-50249b31/" rel="noopener ugc nofollow" target="_blank">Zhiheng Xu</a>, <a class="af os" href="https://www.linkedin.com/in/paullou-sea/" rel="noopener ugc nofollow" target="_blank">Paul Lou</a>, <a class="af os" href="https://www.linkedin.com/in/ziyi-wang-6651b5b1/" rel="noopener ugc nofollow" target="_blank">Ziyi Wang</a>, <a class="af os" href="https://www.linkedin.com/in/jiayu-lou-337ba785/" rel="noopener ugc nofollow" target="_blank">Jiayu Lou</a>, <a class="af os" href="https://www.linkedin.com/in/liuming-zhang-4b120894/" rel="noopener ugc nofollow" target="_blank">Liuming Zhang</a>, <a class="af os" href="https://www.linkedin.com/in/jingwen-qiang-76aba382/" rel="noopener ugc nofollow" target="_blank">Jingwen Qiang</a>, <a class="af os" href="https://www.linkedin.com/in/clintonkelly/" rel="noopener ugc nofollow" target="_blank">Clint Kelly</a>, <a class="af os" href="https://www.linkedin.com/in/mleoshi/" rel="noopener ugc nofollow" target="_blank">Lei Shi</a>, <a class="af os" 
href="https://www.linkedin.com/in/dan-zhao-560460143/" rel="noopener ugc nofollow" target="_blank">Dan Zhao</a>, <a class="af os" href="https://www.linkedin.com/in/huxiaoxu/" rel="noopener ugc nofollow" target="_blank">Xu Hu</a>, <a class="af os" href="https://www.linkedin.com/in/jianqi-liao-84b32510a/" rel="noopener ugc nofollow" target="_blank">Jianqi Liao</a>, <a class="af os" href="https://www.linkedin.com/in/zecheng-xu-11bb778a/" rel="noopener ugc nofollow" target="_blank">Zecheng Xu</a>, <a class="af os" href="https://www.linkedin.com/in/tong-chen-3a5b1519/" rel="noopener ugc nofollow" target="_blank">Tong Chen</a></p><h1 id="62e3" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Introduction</h1><p id="be9c" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Artificial intelligence and large language models (LLMs) are a rapidly evolving sector at the forefront of technological innovation. AI’s capacity for logical reasoning and task completion is changing the way we interact with technology.</p><p id="fa72" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">In this blog post, we will showcase how we advanced Automation Platform, Airbnb’s conversational AI platform, from version 1, which supported conversational systems driven by static workflows, to version 2, which is designed specifically for emerging LLM applications. Now, developers can build LLM applications that help customer support agents work more efficiently, provide better resolutions, and respond more quickly. 
LLM application architecture is a rapidly evolving domain and this blog post provides an overview of our efforts to adopt state-of-the-art LLM architecture to keep enhancing our platform based on the latest developments in the field.</p><h1 id="039d" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Overview of Automation Platform</h1><p id="e438" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">In a previous <a class="af os" rel="noopener" href="https://medium.com/airbnb-engineering/intelligent-automation-platform-empowering-conversational-ai-and-beyond-at-airbnb-869c44833ff2">blog post</a>, we introduced Automation Platform v1, an enterprise-level platform developed by Airbnb to support a suite of conversational AI products.</p><p id="c5e8" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Automation Platform v1 modeled traditional conversational AI products (e.g., chatbots) into predefined step-by-step workflows that could be designed and managed by product engineering and business teams.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*7AX29Y1VoSE9bPvq 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*7AX29Y1VoSE9bPvq 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*7AX29Y1VoSE9bPvq 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*7AX29Y1VoSE9bPvq 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*7AX29Y1VoSE9bPvq 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*7AX29Y1VoSE9bPvq 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*7AX29Y1VoSE9bPvq 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and 
(max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*7AX29Y1VoSE9bPvq 640w, https://miro.medium.com/v2/resize:fit:720/0*7AX29Y1VoSE9bPvq 720w, https://miro.medium.com/v2/resize:fit:750/0*7AX29Y1VoSE9bPvq 750w, https://miro.medium.com/v2/resize:fit:786/0*7AX29Y1VoSE9bPvq 786w, https://miro.medium.com/v2/resize:fit:828/0*7AX29Y1VoSE9bPvq 828w, https://miro.medium.com/v2/resize:fit:1100/0*7AX29Y1VoSE9bPvq 1100w, https://miro.medium.com/v2/resize:fit:1400/0*7AX29Y1VoSE9bPvq 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 1. 
Automation Platform v1 architecture.</figcaption></figure><h1 id="9000" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Challenges of Traditional Conversational AI Systems</h1><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*2e3gA5cZRWaZoblv 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*2e3gA5cZRWaZoblv 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*2e3gA5cZRWaZoblv 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*2e3gA5cZRWaZoblv 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*2e3gA5cZRWaZoblv 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*2e3gA5cZRWaZoblv 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*2e3gA5cZRWaZoblv 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*2e3gA5cZRWaZoblv 640w, https://miro.medium.com/v2/resize:fit:720/0*2e3gA5cZRWaZoblv 720w, https://miro.medium.com/v2/resize:fit:750/0*2e3gA5cZRWaZoblv 750w, https://miro.medium.com/v2/resize:fit:786/0*2e3gA5cZRWaZoblv 786w, https://miro.medium.com/v2/resize:fit:828/0*2e3gA5cZRWaZoblv 828w, https://miro.medium.com/v2/resize:fit:1100/0*2e3gA5cZRWaZoblv 1100w, https://miro.medium.com/v2/resize:fit:1400/0*2e3gA5cZRWaZoblv 1400w" sizes="(min-resolution: 4dppx) and (max-width: 
700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 2. Typical workflow that is supported by v1 of Automation Platform.</figcaption></figure><p id="bfc4" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We saw several challenges when implementing Automation Platform v1, which may also be broadly applicable to typical conversational products:</p><ol class=""><li id="e0da" class="nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or qa qb qc bk">Not flexible enough: AI products follow a predefined (and usually rigid) process.</li><li id="b01a" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qa qb qc bk">Hard to scale: product creators need to manually create workflows and tasks for every scenario, and repeat the process for any new use case later, which is time-consuming and error-prone.</li></ol><h1 id="cc6e" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Opportunities of Conversational AI Driven by LLMs</h1><p id="0ef8" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Our early experiments showed that LLM-powered conversations can provide a more natural and intelligent conversational experience than our current human-designed workflows. 
For example, with an LLM-powered chatbot, customers can engage in natural dialogue, asking open-ended questions and explaining their issues in detail. An LLM can more accurately interpret customer queries, even capturing nuanced information from the ongoing conversation.</p><p id="a037" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">However, LLM-powered applications are still relatively new, and the community is still improving some of their aspects, such as latency and hallucination, to meet production-level requirements. So it is too early to rely on them fully for large-scale and diverse experiences serving millions of customers at Airbnb. For instance, it’s more suitable to use a traditional workflow instead of an LLM to process a claim-related product that requires sensitive data and a number of strict validations.</p><p id="52b6" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We believe that at this moment, the best strategy is to combine them with traditional workflows and leverage the benefits of both approaches.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*bFT0r4T054R24pAm 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*bFT0r4T054R24pAm 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*bFT0r4T054R24pAm 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*bFT0r4T054R24pAm 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*bFT0r4T054R24pAm 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*bFT0r4T054R24pAm 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*bFT0r4T054R24pAm 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 
50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*bFT0r4T054R24pAm 640w, https://miro.medium.com/v2/resize:fit:720/0*bFT0r4T054R24pAm 720w, https://miro.medium.com/v2/resize:fit:750/0*bFT0r4T054R24pAm 750w, https://miro.medium.com/v2/resize:fit:786/0*bFT0r4T054R24pAm 786w, https://miro.medium.com/v2/resize:fit:828/0*bFT0r4T054R24pAm 828w, https://miro.medium.com/v2/resize:fit:1100/0*bFT0r4T054R24pAm 1100w, https://miro.medium.com/v2/resize:fit:1400/0*bFT0r4T054R24pAm 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 3. 
Comparison of traditional workflows and AI-driven workflows</figcaption></figure><h1 id="f1c2" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Architecture of LLM Application on Automation Platform v2</h1><p id="3570" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Figure 4 shows a high-level overview of how Automation Platform v2 powers LLM applications.</p><p id="0534" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Here is an example of a customer asking our LLM chatbot “where is my next reservation?”</p><ul class=""><li id="9b50" class="nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or qi qb qc bk">First, the user inquiry arrives at our platform. Based on the inquiry, our platform collects relevant contextual information, such as previous chat history, user ID, user role, etc.</li><li id="edd9" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qi qb qc bk">After that, our platform loads and assembles the prompt using the inquiry and context, then sends it to the LLM.</li><li id="a926" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qi qb qc bk">In this example, the first LLM response requests a tool execution that makes a service call to fetch the most recent reservation of the current user. 
Our platform follows this request, makes the actual service call, and saves the call response into the current context.</li><li id="71d8" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qi qb qc bk">Next, our platform sends the updated context to the LLM, and the second LLM response is a complete sentence describing the location of the user’s next reservation.</li><li id="55e4" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qi qb qc bk">Lastly, our platform returns the LLM response and records this round of conversation for future reference.</li></ul><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*fRn12cGHL3-CDpoJ 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*fRn12cGHL3-CDpoJ 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*fRn12cGHL3-CDpoJ 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*fRn12cGHL3-CDpoJ 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*fRn12cGHL3-CDpoJ 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*fRn12cGHL3-CDpoJ 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*fRn12cGHL3-CDpoJ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*fRn12cGHL3-CDpoJ 640w, 
https://miro.medium.com/v2/resize:fit:720/0*fRn12cGHL3-CDpoJ 720w, https://miro.medium.com/v2/resize:fit:750/0*fRn12cGHL3-CDpoJ 750w, https://miro.medium.com/v2/resize:fit:786/0*fRn12cGHL3-CDpoJ 786w, https://miro.medium.com/v2/resize:fit:828/0*fRn12cGHL3-CDpoJ 828w, https://miro.medium.com/v2/resize:fit:1100/0*fRn12cGHL3-CDpoJ 1100w, https://miro.medium.com/v2/resize:fit:1400/0*fRn12cGHL3-CDpoJ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 4. Overview of how Automation Platform v2 powers LLM application</figcaption></figure><p id="7d79" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Another important area we support is developers of LLM applications. There are several integrations between our system and developer tools to make the development process seamless. 
Also, we offer a number of tools like context management, guardrails, playground and insights.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*m_yYLeUdYy706yIr 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*m_yYLeUdYy706yIr 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*m_yYLeUdYy706yIr 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*m_yYLeUdYy706yIr 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*m_yYLeUdYy706yIr 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*m_yYLeUdYy706yIr 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*m_yYLeUdYy706yIr 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*m_yYLeUdYy706yIr 640w, https://miro.medium.com/v2/resize:fit:720/0*m_yYLeUdYy706yIr 720w, https://miro.medium.com/v2/resize:fit:750/0*m_yYLeUdYy706yIr 750w, https://miro.medium.com/v2/resize:fit:786/0*m_yYLeUdYy706yIr 786w, https://miro.medium.com/v2/resize:fit:828/0*m_yYLeUdYy706yIr 828w, https://miro.medium.com/v2/resize:fit:1100/0*m_yYLeUdYy706yIr 1100w, https://miro.medium.com/v2/resize:fit:1400/0*m_yYLeUdYy706yIr 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 
700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 5. Overview of how Automation Platform v2 powers LLM developers</figcaption></figure><p id="4d85" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">In the following subsections, we will deep dive into a few key areas on supporting LLM applications including: LLM workflows, context management and guardrails.</p><p id="1744" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">While we won’t cover all aspects in detail in this post, we have also built other components to facilitate LLM practice at Airbnb including:</p><ul class=""><li id="6419" class="nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or qi qb qc bk">Playground feature to bridge the gap between development and production tech stacks by allowing prompt writers to freely iterate on their prompts.</li><li id="2c70" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qi qb qc bk">LLM-oriented observability with detailed insights into each LLM interaction, like latency and token usage.</li><li id="c575" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qi qb qc bk">Enhancement to Tool management that is responsible for tools registration, the publishing process, execution and observability.</li></ul><h1 id="c677" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Chain of Thought Workflow</h1><p id="a9db" class="pw-post-body-paragraph nw nx gu ny b hs pp 
oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk"><a class="af os" href="https://arxiv.org/pdf/2201.11903.pdf" rel="noopener ugc nofollow" target="_blank">Chain of Thought</a> is one of the AI agent frameworks that enable LLMs to reason about issues.</p><p id="d3ec" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We implemented the concept of Chain of Thought in the form of a workflow on Automation Platform v2 as shown below. The core idea of Chain of Thought is to use an LLM as the reasoning engine to determine which tools to use and in which order. Tools are the way an LLM interacts with the world to solve real problems, for example checking a reservation’s status or checking listing availability.</p><p id="f441" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Tools are essentially actions and workflows, the basic building blocks of traditional products in Automation Platform v1. 
Actions and workflows work well as tools in Chain of Thought because of their unified interface and managed execution environment.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*7GQyQDJA8KxkmqT4 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*7GQyQDJA8KxkmqT4 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*7GQyQDJA8KxkmqT4 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*7GQyQDJA8KxkmqT4 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*7GQyQDJA8KxkmqT4 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*7GQyQDJA8KxkmqT4 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*7GQyQDJA8KxkmqT4 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*7GQyQDJA8KxkmqT4 640w, https://miro.medium.com/v2/resize:fit:720/0*7GQyQDJA8KxkmqT4 720w, https://miro.medium.com/v2/resize:fit:750/0*7GQyQDJA8KxkmqT4 750w, https://miro.medium.com/v2/resize:fit:786/0*7GQyQDJA8KxkmqT4 786w, https://miro.medium.com/v2/resize:fit:828/0*7GQyQDJA8KxkmqT4 828w, https://miro.medium.com/v2/resize:fit:1100/0*7GQyQDJA8KxkmqT4 1100w, https://miro.medium.com/v2/resize:fit:1400/0*7GQyQDJA8KxkmqT4 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 6. Overview of Chain of Thought workflow</figcaption></figure><p id="ec2a" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Figure 6 contains the main steps of the Chain of Thought workflow. It starts with preparing context for the LLM, including prompt, contextual data, and historical conversations. Then it triggers the logic reasoning loop: asking the LLM for reasoning, executing the LLM-requested tool and processing the tool’s outcome. Chain of Thought will stay in the reasoning loop until a result is generated.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*xBuAocmzlk2IMVaU 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*xBuAocmzlk2IMVaU 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*xBuAocmzlk2IMVaU 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*xBuAocmzlk2IMVaU 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*xBuAocmzlk2IMVaU 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*xBuAocmzlk2IMVaU 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*xBuAocmzlk2IMVaU 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 
65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*xBuAocmzlk2IMVaU 640w, https://miro.medium.com/v2/resize:fit:720/0*xBuAocmzlk2IMVaU 720w, https://miro.medium.com/v2/resize:fit:750/0*xBuAocmzlk2IMVaU 750w, https://miro.medium.com/v2/resize:fit:786/0*xBuAocmzlk2IMVaU 786w, https://miro.medium.com/v2/resize:fit:828/0*xBuAocmzlk2IMVaU 828w, https://miro.medium.com/v2/resize:fit:1100/0*xBuAocmzlk2IMVaU 1100w, https://miro.medium.com/v2/resize:fit:1400/0*xBuAocmzlk2IMVaU 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 7. 
High level components powering Chain of Thought in Automation Platform</figcaption></figure><p id="b4c5" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Figure 7 shows all high-level components powering Chain of Thought:</p><ol class=""><li id="30e0" class="nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or qa qb qc bk">CoT (Chain of Thought) IO handler: assemble the prompt, prepare contextual data, collect user input and general data processing before sending it to the LLM.</li><li id="87cd" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qa qb qc bk">Tool Manager: prepare tool payload with LLM input &amp; output, manage tool execution and offer quality of life features like retry or rate limiting.</li><li id="e13b" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qa qb qc bk">LLM Adapter: allow developers to add customized logic facilitating integration with different types of LLMs.</li></ol><h1 id="3e39" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Context Management</h1><p id="f4d8" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">To ensure the LLM makes the best decision, we need to provide all necessary and relevant information to the LLM such as historical interactions with the LLM, the intent of the customer support inquiry, current trip information and more. For use cases like offline evaluation, point-in-time data retrieval is also supported by our system via configuration.</p><p id="8dae" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Given the large amount of available contextual information, developers are allowed to either statically declare the needed context (e.g. 
customer name) or name a dynamic context retriever (e.g. relevant help articles for the customer’s questions).</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*RJ35yu1HEAwdLh0l 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*RJ35yu1HEAwdLh0l 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*RJ35yu1HEAwdLh0l 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*RJ35yu1HEAwdLh0l 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*RJ35yu1HEAwdLh0l 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*RJ35yu1HEAwdLh0l 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*RJ35yu1HEAwdLh0l 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*RJ35yu1HEAwdLh0l 640w, https://miro.medium.com/v2/resize:fit:720/0*RJ35yu1HEAwdLh0l 720w, https://miro.medium.com/v2/resize:fit:750/0*RJ35yu1HEAwdLh0l 750w, https://miro.medium.com/v2/resize:fit:786/0*RJ35yu1HEAwdLh0l 786w, https://miro.medium.com/v2/resize:fit:828/0*RJ35yu1HEAwdLh0l 828w, https://miro.medium.com/v2/resize:fit:1100/0*RJ35yu1HEAwdLh0l 1100w, https://miro.medium.com/v2/resize:fit:1400/0*RJ35yu1HEAwdLh0l 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and 
(max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 8. Overall architecture of context management in Automation Platform v2</figcaption></figure><p id="70ff" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Context Management is the key component ensuring the LLM has access to all necessary contextual information. Figure 8 shows the major Context Management components:</p><ol class=""><li id="7de3" class="nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or qa qb qc bk">Context Loader: connect to different sources and fetch relevant context based on developers’ customizable fetching logic.</li><li id="4ead" class="nw nx gu ny b hs qd oa ob hv qe od oe of qf oh oi oj qg ol om on qh op oq or qa qb qc bk">Runtime Context Manager: maintain runtime context, process context for each LLM call and interact with context storage.</li></ol><h1 id="a0bf" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Guardrails Framework</h1><p id="01df" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">LLMs are powerful text generation tools, but they can also come with issues like <a class="af os" href="https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)" rel="noopener ugc nofollow" target="_blank">hallucinations</a> and <a class="af os" href="https://en.wikipedia.org/wiki/Prompt_injection#Types" rel="noopener ugc nofollow" target="_blank">jailbreaks</a>. 
This is where our Guardrails Framework comes in, a safe-guarding mechanism that monitors communications with the LLM, ensuring it is helpful, relevant and ethical.</p><figure class="nl nm nn no np nq ni nj paragraph-image"><div role="button" tabindex="0" class="nr ns fj nt bh nu"><div class="ni nj pu"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*FOWFHxZdjI2jWoMm 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*FOWFHxZdjI2jWoMm 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*FOWFHxZdjI2jWoMm 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*FOWFHxZdjI2jWoMm 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*FOWFHxZdjI2jWoMm 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*FOWFHxZdjI2jWoMm 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*FOWFHxZdjI2jWoMm 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*FOWFHxZdjI2jWoMm 640w, https://miro.medium.com/v2/resize:fit:720/0*FOWFHxZdjI2jWoMm 720w, https://miro.medium.com/v2/resize:fit:750/0*FOWFHxZdjI2jWoMm 750w, https://miro.medium.com/v2/resize:fit:786/0*FOWFHxZdjI2jWoMm 786w, https://miro.medium.com/v2/resize:fit:828/0*FOWFHxZdjI2jWoMm 828w, https://miro.medium.com/v2/resize:fit:1100/0*FOWFHxZdjI2jWoMm 1100w, https://miro.medium.com/v2/resize:fit:1400/0*FOWFHxZdjI2jWoMm 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) 
and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv ff pw ni nj px py bf b bg z du">Figure 9. Guardrails Framework architecture</figcaption></figure><p id="4fe7" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Figure 9 shows the architecture of Guardrails Framework where engineers from different teams create reusable guardrails. During runtime, guardrails can be executed in parallel and leverage different downstream tech stacks. For example, the content moderation guardrail calls various LLMs to detect violations in communication content, and tool guardrails use rules to prevent bad execution, for example updating listings with invalid setup.</p><h1 id="cdec" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">What’s Next</h1><p id="18ca" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">In this blog, we presented the most recent evolution of Automation Platform, the conversational AI platform at Airbnb, to power emerging LLM applications.</p><p id="1871" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">LLM application is a rapidly developing domain, and we will continue to evolve with these transformative technologies, explore other <a class="af os" href="https://arxiv.org/abs/2305.10601" rel="noopener ugc nofollow" target="_blank">AI agent frameworks</a>, expand Chain of Thought tool capabilities and investigate LLM application simulation. 
We anticipate further efficiency and productivity gains for all AI practitioners at Airbnb with these innovations.</p><p id="5a2d" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">We’re hiring! If work like this interests you, check out our <a class="af os" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">careers site</a>.</p><h1 id="5dc1" class="ot ou gu bf ov ow ox hu oy oz pa hx pb pc pd pe pf pg ph pi pj pk pl pm pn po bk">Acknowledgements</h1><p id="790d" class="pw-post-body-paragraph nw nx gu ny b hs pp oa ob hv pq od oe of pr oh oi oj ps ol om on pt op oq or gn bk">Thanks to Mia Zhao, Zay Guan, Michael Lubavin, Wei Wu, Yashar Mehdad, Julian Warszawski, Ting Luo, Junlan Li, Wayne Zhang, Zhenyu Zhao, Yuanpei Cao, Yisha Wu, Peng Wang, Heng Ji, Tiantian Zhang, Cindy Chen, Hanchen Su, Wei Han, Mingzhi Xu, Ying Lyu, Elaine Liu, Hengyu Zhou, Teng Wang, Shawn Yan, Zecheng Xu, Haiyu Zhang, Gary Pan, Tong Chen, Pei-Fen Tu, Ying Tan, Fengyang Chen, Haoran Zhu, Xirui Liu, Tony Jiang, Xiao Zeng, Wei Wu, Tongyun Lv, Zixuan Yang, Keyao Yang, Danny Deng, Xiang Lan and Wei Ji for the product collaborations.</p><p id="83a5" class="pw-post-body-paragraph nw nx gu ny b hs nz oa ob hv oc od oe of og oh oi oj ok ol om on oo op oq or gn bk">Thanks to Joy Zhang, Raj Rajagopal, Tina Su, Peter Frank, Shuohao Zhang, Jack Song, Navjot Sidhu, Weiping Peng, Andy Yasutake and Hanlin Fang for their leadership support of the Intelligent Automation Platform.</p></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/automation-platform-v2-improving-conversational-ai-at-airbnb-d86c9386e0cb</link>
      <guid>https://medium.com/airbnb-engineering/automation-platform-v2-improving-conversational-ai-at-airbnb-d86c9386e0cb</guid>
      <pubDate>Mon, 28 Oct 2024 18:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Sandcastle: data/AI apps for everyone]]></title>
      <description><![CDATA[<div><div></div><p id="13a8" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb made it easy to bring data/AI ideas to life through a platform for prototyping web applications.</p><p id="1d5d" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">By:</strong> Dan Miller</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv nw"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 720w, 
https://miro.medium.com/v2/resize:fit:750/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*eGkAsMkZXIEKQGhLiQyCTw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oi ff oj nu nv ok ol bf b bg z du">Warm, friendly beach capturing the playful nature of prototyping.</figcaption></figure><h1 id="fe43" class="om on gu bf oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Introduction</h1><p id="327a" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk"><a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/data-quality-at-airbnb-870d03080469">Trustworthy data</a> has always been a part of Airbnb’s technical DNA. However, it is challenging for our data scientists and ML practitioners to bring data- and AI-powered product ideas to life in a way that resonates with our <a class="af pp" href="https://news.airbnb.com/designing-the-future-of-airbnb/" rel="noopener ugc nofollow" target="_blank">design-focused leadership</a>. Slide decks with screenshots, design documents with plots, and even Figmas are insufficient to capture ideas that need to be experienced in order to be understood. 
This was especially true as large language models (LLMs) took the world by storm, since they are typically used interactively in chat interfaces.</p><p id="ee8a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In this blog post, we’ll focus on Sandcastle, an Airbnb-internal prototyping platform that enables data scientists, engineers, and even product managers to bring data/AI ideas to life as internal web applications for our design and product teams. Through Sandcastle, hundreds of individuals can be “cereal entrepreneurs” — empowered to directly iterate on and share their ideas. We’ll talk through common industry challenges involved in sharing web applications internally, give an overview of how Airbnb solved these challenges by building on top of its existing cloud infrastructure, and showcase the scale of our results.</p><h1 id="578d" class="om on gu bf oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Challenges</h1><p id="8b97" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">Imagine a data scientist is working on a <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/airbnb-at-kdd-2023-9084ad244d8c">typical data science problem at Airbnb</a>: optimizing the positive milestones guests reach along their user journey, visualizing that journey, or improving explainability and statistical power in mathematically challenging scenarios like <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/artificial-counterfactual-estimation-ace-machine-learning-based-causal-inference-at-airbnb-ee32ee4d0512">company-wide launches without A/B</a>, or <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/airbnb-brandometer-powering-brand-perception-measurement-on-social-media-data-with-ai-c83019408051">measuring brand perception</a>. 
The data scientist has a brilliant LLM-powered idea. They want to demonstrate the capability their idea exposes in an interactive way, ideally one that can easily “go viral” with non-technical stakeholders. Standing between the idea and stakeholders are several challenges.</p><p id="6fbb" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Leadership &amp; non-technical stakeholders will not want to run a Jupyter notebook, but they can click around in a UI and try out different input assumptions, choose different techniques, and deep-dive into outputs.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv pq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*X04kxt44BStX2BbA 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*X04kxt44BStX2BbA 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*X04kxt44BStX2BbA 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*X04kxt44BStX2BbA 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*X04kxt44BStX2BbA 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*X04kxt44BStX2BbA 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*X04kxt44BStX2BbA 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*X04kxt44BStX2BbA 640w, 
https://miro.medium.com/v2/resize:fit:720/0*X04kxt44BStX2BbA 720w, https://miro.medium.com/v2/resize:fit:750/0*X04kxt44BStX2BbA 750w, https://miro.medium.com/v2/resize:fit:786/0*X04kxt44BStX2BbA 786w, https://miro.medium.com/v2/resize:fit:828/0*X04kxt44BStX2BbA 828w, https://miro.medium.com/v2/resize:fit:1100/0*X04kxt44BStX2BbA 1100w, https://miro.medium.com/v2/resize:fit:1400/0*X04kxt44BStX2BbA 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oi ff oj nu nv ok ol bf b bg z du">Sandcastle app development</figcaption></figure><p id="c3a7" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Data scientists are most comfortable writing Python code, and are quite unfamiliar with the world of modern web development (TypeScript, React, etc.). <strong class="my gv">How can they capture their idea in an interactive application</strong>, even in their own development environment? Traditionally, this is done by collaborating with a frontend engineering team, but that brings its own set of challenges. Engineering bandwidth is typically limited, so prototyping new ideas must go through lengthy planning and prioritization cycles. 
Worse, it is nearly impossible for data scientists to iterate on the science behind their ideas, since any change must go through reprioritization and implementation.</p><p id="0735" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Suppose we can surmount the challenge of capturing an idea in a locally-run interactive web application. <strong class="my gv">How do we package and share it in a way that other data scientists can easily reproduce using standard infrastructure?</strong></p><p id="168a" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk"><strong class="my gv">How can a data science organization handle infrastructure</strong>, networking with other parts of Airbnb’s complex tech stack, authentication so their apps don’t leak sensitive data, and storage for any temporary or intermediate data? <strong class="my gv">How can they create easily shareable “handles” for their web applications</strong> that can go viral internally?</p><h2 id="23eb" class="pr on gu bf oo ps pt dy os pu pv ea ow nh pw px py nl pz qa qb np qc qd qe qf bk">Sandcastle</h2><p id="4031" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">Airbnb’s solution to the challenges above is called <strong class="my gv">Sandcastle</strong>. 
It brings together <a class="af pp" href="https://wamlm-kdd.github.io/wamlm/2023.html" rel="noopener ugc nofollow" target="_blank">Onebrain</a>: Airbnb’s packaging framework for data science / prototyping code, <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/a-krispr-approach-to-kubernetes-infrastructure-a0741cff4e0c">kube-gen</a>: Airbnb’s infrastructure for generated Kubernetes configuration, and <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/dynamic-kubernetes-cluster-scaling-at-airbnb-d79ae3afa132">OneTouch</a>: Airbnb’s infrastructure layer for dynamically scaled Kubernetes clusters. Sandcastle is accessible for data scientists, software developers, and even product managers, whether their preferred language is Python, TypeScript, R, or something else. We have had team members use Sandcastle to go from “idea” to “live internal app” in less than an hour.</p><figure class="nx ny nz oa ob oc nu nv paragraph-image"><div role="button" tabindex="0" class="od oe fj of bh og"><div class="nu nv qg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*VFg02QsZn-_tisE4 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*VFg02QsZn-_tisE4 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*VFg02QsZn-_tisE4 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*VFg02QsZn-_tisE4 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*VFg02QsZn-_tisE4 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*VFg02QsZn-_tisE4 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*VFg02QsZn-_tisE4 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and 
(max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*VFg02QsZn-_tisE4 640w, https://miro.medium.com/v2/resize:fit:720/0*VFg02QsZn-_tisE4 720w, https://miro.medium.com/v2/resize:fit:750/0*VFg02QsZn-_tisE4 750w, https://miro.medium.com/v2/resize:fit:786/0*VFg02QsZn-_tisE4 786w, https://miro.medium.com/v2/resize:fit:828/0*VFg02QsZn-_tisE4 828w, https://miro.medium.com/v2/resize:fit:1100/0*VFg02QsZn-_tisE4 1100w, https://miro.medium.com/v2/resize:fit:1400/0*VFg02QsZn-_tisE4 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h2 id="6d09" class="pr on gu bf oo ps pt dy os pu pv ea ow nh pw px py nl pz qa qb np qc qd qe qf bk">Onebrain</h2><p id="2d0e" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">The open source ecosystem solves our first challenge: interactivity. Frameworks like <a class="af pp" href="https://www.youtube.com/watch?v=X3rTUZOm2jA" rel="noopener ugc nofollow" target="_blank">Streamlit</a>, <a class="af pp" href="https://dash.plotly.com/" rel="noopener ugc nofollow" target="_blank">Dash</a>, and <a class="af pp" href="https://fastapi.tiangolo.com/" rel="noopener ugc nofollow" target="_blank">FastAPI</a> make it a delight for non-frontend developers to get an application up and running in their own development environment. 
Onebrain solves the second challenge: how to package a working set of code in a reproducible manner. We <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/airbnb-at-kdd-2023-9084ad244d8c">presented on Onebrain in detail at KDD 2023</a> but include a brief summary here. Onebrain assumes you arrange your code in “projects”: collections of arbitrary source code around a onebrain.yml file that looks like the one below.</p><pre class="nx ny nz oa ob qh qi qj bp qk bb bk">name: youridea<br />version: 1.2.3<br />description: Example Sandcastle app<br />authors: ['Jane Doe &lt;jane.doe@airbnb.email&gt;']<br />build_enabled: true<br />entry_points:<br />  main:<br />    type: shell<br />    command: streamlit run app.py --server.port {{port}}<br />    parameters:<br />      port: {type: int, default: 8880}<br />env:<br />  python:<br />    pip: {streamlit: ==1.34.0}</pre><p id="a0bc" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">This “project file” includes metadata such as name, version, and authorship, along with a collection of command line entry points that may run shell scripts, Python code, etc., and an environment specification directing which Python and R packages are needed to run. A developer may run “<strong class="my gv">brain run</strong>” in the same directory as their project file for interactive development. Onebrain is integrated with Airbnb’s continuous integration, so every commit of the project will be published to our snapshot service. The snapshot service is a lightweight mechanism for storing immutable copies of source code that may be easily downloaded from anywhere else in Airbnb’s tech stack. 
Services may invoke</p><pre class="nx ny nz oa ob qh qi qj bp qk bb bk">brain run youridea --port 9877</pre><p id="1c2d" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">to resolve the latest snapshot of the project, bootstrap any dependencies, and invoke the parameterized shell command. This decouples rapid iteration on application logic from slower CI/CD against the service configuration we’ll talk about below.</p><h2 id="0051" class="pr on gu bf oo ps pt dy os pu pv ea ow nh pw px py nl pz qa qb np qc qd qe qf bk">kube-gen</h2><p id="efda" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">Cloud infrastructure is challenging to configure correctly, especially for data scientists. Fortunately, Airbnb has <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/a-krispr-approach-to-kubernetes-infrastructure-a0741cff4e0c">built a code-generation layer on top of Kubernetes called kube-gen</a>, which handles most of the authentication, tracing, and cross-service communication for you. Sandcastle further simplifies things by using kube-gen hooks to generate all but one service configuration file on the developer’s behalf during build. The kube-gen configuration for a typical application would include environment-specific service parameters, Kubernetes app + container configuration, <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/continuous-delivery-at-airbnb-6ac042bc7876">Spinnaker™ pipeline definitions</a>, and configuration for <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/improving-istio-propagation-delay-d4da9b5b9f90">Airbnb’s network proxy</a>. Sandcastle generates sensible defaults for all of that configuration on the fly, so that all an app developer needs to write is a simple container configuration file like the one below. 
Multiple developers have raised support threads because the configuration was so simple that they thought they were making a mistake!</p><pre class="nx ny nz oa ob qh qi qj bp qk bb bk">name: sandcastle-youridea<br />image: {{ .Env.Params.pythonImage }}<br />command:<br />  - brain<br />  - download-and-run<br />  - youridea<br />  - --port<br />  - {{ .Env.Params.port }}<br />resources: {{ ToInlineYaml .Env.Params.containerResources }}</pre><p id="88fc" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">The file above allows an app developer to configure which Onebrain project to run and which port it exposes a process on, and to customize the underlying Docker image and CPU+RAM resources if necessary.</p><p id="f893" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Within 10–15 minutes of checking in a file like the one above, the app will be live at an easily shareable URL like <a class="af pp" href="https://youridea.airbnb.proxy/" rel="noopener ugc nofollow" target="_blank">https://youridea.airbnb.proxy/</a>, where it can be shared with anyone at the company who has a working corporate login. Sandcastle also handles “identity propagation” from visiting users to the underlying data warehouse infrastructure, to ensure that applications respect user permissions around accessing sensitive metrics and tables.</p><h1 id="15d9" class="om on gu bf oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Replicating Sandcastle</h1><p id="3e28" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">Product ideas powered by data and AI are best developed through rapid iteration on shareable, lightweight live prototypes, instead of static proposals. There are multiple challenges to facilitating the creation of secure internal prototypes. 
Open source frameworks like Streamlit and <a class="af pp" href="https://dash.plotly.com/" rel="noopener ugc nofollow" target="_blank">Dash</a> help, but aren’t enough: you also need a hosting platform. It doesn’t make sense to open source Sandcastle, because the answers to “how does my service talk to others” or “how does authentication work” are so different across company infrastructures. Instead, <strong class="my gv">any company can use Sandcastle’s approach as a recipe: 1) Application: adapt open source web application frameworks to its own tech stack, and 2) Hosting platform: run them on a platform that handles authentication and networking</strong> and provides shareable links.</p><p id="e371" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Here is a quick summary of the things you’ll need to think about if you hope to build a “Sandcastle” for your own company:</p><ul class=""><li id="755e" class="mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt qq qr qs bk"><strong class="my gv">Open source web application framework(s)</strong>: At Airbnb we largely use <a class="af pp" href="https://docs.streamlit.io/" rel="noopener ugc nofollow" target="_blank">Streamlit</a> for data science prototyping, with a bit of <a class="af pp" href="https://fastapi.tiangolo.com/" rel="noopener ugc nofollow" target="_blank">FastAPI</a> and <a class="af pp" href="https://react.dev/reference/react" rel="noopener ugc nofollow" target="_blank">React</a> for more bespoke prototypes. 
Prioritize ease of development (especially hot reload), a rich ecosystem of open source components, and performant UIs via caching.</li><li id="74c0" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl qw nn no np qx nr ns nt qq qr qs bk"><strong class="my gv">Packaging system</strong>: a way of publishing snapshots of “data/AI prototype code” from DS/ML development environments to somewhere consumable from elsewhere in your tech stack. At Airbnb we use <a class="af pp" href="https://wamlm-kdd.github.io/wamlm/papers/wamlm-kdd23_paper_Daniel_Miller.pdf" rel="noopener ugc nofollow" target="_blank">Onebrain</a>, but there are many paid public alternatives.</li><li id="7d46" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl qw nn no np qx nr ns nt qq qr qs bk"><strong class="my gv">Reproducible runs of DS/ML code:</strong> this should include Python / Conda environment management. Airbnb uses Onebrain for this as well, but you may consider <a class="af pp" href="https://pip.pypa.io/en/stable/installation/" rel="noopener ugc nofollow" target="_blank">pip</a>.</li></ul><p id="ba48" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In addition, you’ll need prototyping-friendly solutions for the three pillars of cloud computing:</p><ul class=""><li id="de55" class="mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt qq qr qs bk"><strong class="my gv">Compute</strong>: spin up a remote hosting environment with little or ideally no complicated infrastructure configuration required.</li><li id="31ca" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl qw nn no np qx nr ns nt qq qr qs bk"><strong class="my gv">Storage</strong>: access to ephemeral storage for caching and, more importantly, access to your company’s data warehouse infrastructure so prototypes can query your offline data.</li><li id="17dd" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl 
qw nn no np qx nr ns nt qq qr qs bk"><strong class="my gv">Networking</strong>: an authentication proxy that allows internal users to access prototypes, ideally via easily memorable domains like appname.yourproxy.io, and passes along user information so prototypes can pass visitor credentials through to the data warehouse or other services. Also, read-only access to other internal services so prototypes can query live data.</li></ul><p id="c7bc" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Build with a view towards “going viral”, and you’ll end up with a larger internal audience than you expect, especially if your platform is deliberately flexible. Such a platform allows app developers to focus on leveraging the rich open source prototyping ecosystem. More importantly, key stakeholders will be able to directly experience data/AI ideas at an early stage.</p><h1 id="d5a2" class="om on gu bf oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Conclusion</h1><p id="429b" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">Sandcastle unlocked fast and easy deployment and iteration of new ideas, especially in the data and ML (including LLMs, generative AI) spaces. For the first time, data scientists and PMs are able to directly iterate on interactive versions of their ideas, without needing lengthy cycles for prioritization with an engineering team.</p><p id="2de9" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Airbnb’s data science, engineering, and product management community developed over 175 live prototypes in the last year, 6 of which were used for high-impact use cases. These were visited by over 3.5k unique internal visitors across over 69k distinct active days. 
Hundreds of internal users a week visit one of our many internal prototypes to directly interact with them. This led to an ongoing cultural shift from using decks / docs to using live prototypes.</p><p id="a233" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">If this type of work interests you, check out some of our related positions:</p><ul class=""><li id="51a0" class="mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt qq qr qs bk"><a class="af pp" href="https://careers.airbnb.com/positions/5875290/" rel="noopener ugc nofollow" target="_blank">Staff data scientist — Algorithms, Trust</a></li><li id="c8b3" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl qw nn no np qx nr ns nt qq qr qs bk"><a class="af pp" href="https://careers.airbnb.com/positions/5927030/" rel="noopener ugc nofollow" target="_blank">Senior Staff Software Engineer, AI for Developer Productivity</a></li><li id="7c7c" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl qw nn no np qx nr ns nt qq qr qs bk">… and more at <a class="af pp" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">Careers at Airbnb</a>!</li></ul><p id="5784" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">You can also learn more about data science and AI at Airbnb by checking out <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/airbnb-at-kdd-2023-9084ad244d8c">Airbnb at KDD 2023</a>, <a class="af pp" rel="noopener" href="https://medium.com/airbnb-engineering/airbnb-brandometer-powering-brand-perception-measurement-on-social-media-data-with-ai-c83019408051">Airbnb Brandometer: Powering Brand Perception Measurement on Social Media Data with AI</a>, and <a class="af pp" rel="noopener" 
href="https://medium.com/airbnb-engineering/chronon-airbnbs-ml-feature-platform-is-now-open-source-d9c4dba859e8">Chronon, Airbnb’s ML Feature Platform, Is Now Open Source</a>.</p><h1 id="f294" class="om on gu bf oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj bk">Acknowledgments</h1><p id="bb3e" class="pw-post-body-paragraph mw mx gu my b mz pk nb nc nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt gn bk">Thanks to:</p><ul class=""><li id="1d0d" class="mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt qq qr qs bk"><a class="qy it fe" href="https://medium.com/u/55e89bdb611e" rel="noopener" target="_blank">Chris C Williams</a>, <a class="af pp" href="mailto:erik.iverson@airbnb.com" rel="noopener ugc nofollow" target="_blank">Erik Iverson</a>, <a class="qy it fe" href="https://medium.com/u/d3e044d2752c" rel="noopener" target="_blank">Mike Dodge</a>, <a class="qy it fe" href="https://medium.com/u/910acc9d635b" rel="noopener" target="_blank">Patrick Srail</a> — for being early adopters whose feedback was critical in shaping Sandcastle’s evolution.</li><li id="0212" class="mw mx gu my b mz qt nb nc nd qu nf ng nh qv nj nk nl qw nn no np qx nr ns nt qq qr qs bk"><a class="qy it fe" href="https://medium.com/u/d42a7a48aeb8" rel="noopener" target="_blank">Alex Deng</a>, <a class="af pp" href="mailto:carolina.barcenas@airbnb.com" rel="noopener ugc nofollow" target="_blank">Carolina Barcenas</a>, <a class="af pp" href="mailto:navin.sivanandam@airbnb.com" rel="noopener ugc nofollow" target="_blank">Navin Sivanandam</a> — for their leadership support.</li></ul></div>]]></description>
      <link>https://medium.com/airbnb-engineering/sandcastle-data-ai-apps-for-everyone-439f3b78b223</link>
      <guid>https://medium.com/airbnb-engineering/sandcastle-data-ai-apps-for-everyone-439f3b78b223</guid>
      <pubDate>Tue, 24 Sep 2024 19:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Riverbed Data Hydration — Part 1]]></title>
      <description><![CDATA[<div><div></div><figure class="mz na nb nc nd ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx my"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*7kF7y_GLrhJyalhRpaHgTg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="c7b3" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">A deep dive into the streaming aspect of the Lambda architecture framework that optimizes how data is consumed from system-of-record data stores and updates secondary read-optimized stores at Airbnb.</p><h1 id="ca28" class="oi oj gu bf ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bk">Overview</h1><p id="1d33" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">In our <a class="af pl" rel="noopener" href="https://medium.com/airbnb-engineering/riverbed-optimizing-data-access-at-airbnbs-scale-c37ecf6456d9">previous blog post</a> we introduced the motivation and high-level architecture of Riverbed. As a recap, Riverbed is a part of Airbnb’s tech stack designed to streamline and optimize how data is consumed from system-of-record data stores and update secondary read-optimized stores. The framework is built around the concept of ‘materialized views’ — denormalized representations of data that can be queried in a predictable, efficient manner. The primary goal of Riverbed is to improve scalability, enable more efficient data fetching patterns, and provide enhanced filtering and search capabilities for a better user experience. 
It achieves this by keeping the read-optimized store up to date with the system-of-record data stores, and by making it easier for developers to build and manage pipelines that stitch together data from various data sources.</p><p id="4074" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">In this blog post, we will delve deeper into the streaming aspect of the Lambda architecture framework. We’ll walk through its critical components step by step and explain how it constructs and sinks the materialized view from the Change Data Capture (CDC) events of various online data sources. Specifically, we’ll take a closer look at the join transformation within the Notification Pipeline, illustrating how we designed a DAG-like data structure to join different data sources in a memory-efficient manner.</p><p id="fc5b" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">To make the framework and its components easier to understand, let’s begin with a simplified example of a Riverbed pipeline definition:</p><pre class="pm pn po pp pq pr ps pt bp pu bb bk">{<br />  Review {<br />    id @documentId<br />    reviewUser {<br />      id<br />      firstName<br />      lastName<br />    }<br />  }<br />}</pre><p id="71a1" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Riverbed provides a declarative schema-based interface for customers to define Riverbed pipelines. 
From the sample definition above, a Riverbed pipeline is configured to integrate data sources from the Review and User entities, generating Riverbed sink documents with the review ID as the materialized view document ID.</p><p id="08aa" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Based on this definition, Riverbed generates two types of streaming pipelines:</p><ul class=""><li id="0833" class="nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh qa qb qc bk"><strong class="nm gv">Source Pipelines:</strong> Two pipelines consume CDC events from the Review and User tables respectively and publish Apache Kafka® events known as notification events, indicating which documents need to be refreshed.</li><li id="8563" class="nk nl gu nm b nn qd np nq nr qe nt nu nv qf nx ny nz qg ob oc od qh of og oh qa qb qc bk"><strong class="nm gv">Notification Pipeline: </strong>This pipeline consumes the notification events published by the source pipelines and constructs materialized view documents to be written into sink stores.</li></ul><p id="6732" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Now, let us delve deeper into these two types of pipelines.</p><h2 id="c5a5" class="qi oj gu bf ok qj qk dy oo ql qm ea os nv qn qo qp nz qq qr qs od qt qu qv qw bk">Source Pipeline</h2><figure class="pm pn po pp pq ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx qx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*1G_PTIIp4nuq9VmM 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*1G_PTIIp4nuq9VmM 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*1G_PTIIp4nuq9VmM 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*1G_PTIIp4nuq9VmM 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*1G_PTIIp4nuq9VmM 828w, 
https://miro.medium.com/v2/resize:fit:1100/format:webp/0*1G_PTIIp4nuq9VmM 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*1G_PTIIp4nuq9VmM 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*1G_PTIIp4nuq9VmM 640w, https://miro.medium.com/v2/resize:fit:720/0*1G_PTIIp4nuq9VmM 720w, https://miro.medium.com/v2/resize:fit:750/0*1G_PTIIp4nuq9VmM 750w, https://miro.medium.com/v2/resize:fit:786/0*1G_PTIIp4nuq9VmM 786w, https://miro.medium.com/v2/resize:fit:828/0*1G_PTIIp4nuq9VmM 828w, https://miro.medium.com/v2/resize:fit:1100/0*1G_PTIIp4nuq9VmM 1100w, https://miro.medium.com/v2/resize:fit:1400/0*1G_PTIIp4nuq9VmM 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qy ff qz mw mx ra rb bf b bg z du"><strong class="bf ok">Picture 1. 
High-level system diagram of Riverbed</strong></figcaption></figure><p id="8192" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Picture 1 shows the Source Pipeline as the first component in Riverbed. It is an auto-generated pipeline that listens to changes in system-of-record data sources. When changes occur, the Source Pipeline constructs NotificationEvents and emits them onto the Notification Kafka® topic to notify the Notification Pipeline on which documents should be refreshed. In the event-driven architecture of Riverbed, the Source Pipeline acts as the initial trigger for real-time updates in the read-optimized store. It not only ensures that the mutations in the underlying data sources are appropriately captured and communicated to the Notification Pipeline for subsequent processing, but also is the key solution for the concurrency and versioning issues in the framework.</p><p id="8373" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">While the emphasis of this blog post is the Notification Pipeline, a detailed exploration of the Source Pipeline — especially its critical role in maintaining real-time data consistency and its interaction with Notification Pipelines — will be discussed in the next blog post of this series.</p><h2 id="6d46" class="qi oj gu bf ok qj qk dy oo ql qm ea os nv qn qo qp nz qq qr qs od qt qu qv qw bk">Notification Pipeline</h2><figure class="pm pn po pp pq ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx rc"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*fu9QDy6NTigmN0ac 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*fu9QDy6NTigmN0ac 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*fu9QDy6NTigmN0ac 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*fu9QDy6NTigmN0ac 786w, 
https://miro.medium.com/v2/resize:fit:828/format:webp/0*fu9QDy6NTigmN0ac 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*fu9QDy6NTigmN0ac 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*fu9QDy6NTigmN0ac 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*fu9QDy6NTigmN0ac 640w, https://miro.medium.com/v2/resize:fit:720/0*fu9QDy6NTigmN0ac 720w, https://miro.medium.com/v2/resize:fit:750/0*fu9QDy6NTigmN0ac 750w, https://miro.medium.com/v2/resize:fit:786/0*fu9QDy6NTigmN0ac 786w, https://miro.medium.com/v2/resize:fit:828/0*fu9QDy6NTigmN0ac 828w, https://miro.medium.com/v2/resize:fit:1100/0*fu9QDy6NTigmN0ac 1100w, https://miro.medium.com/v2/resize:fit:1400/0*fu9QDy6NTigmN0ac 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qy ff qz mw mx ra rb bf b bg z du"><strong class="bf ok">Picture 2. 
Notification Pipeline components</strong></figcaption></figure><p id="71fa" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">The Notification Pipeline is the core component of the Riverbed framework. It consumes notification events, then queries dependent data sources and stitches together “documents” that are written into a read-optimized sink to support a materialized view. A notification event is processed by the following operations:</p><ul class=""><li id="b3c3" class="nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh qa qb qc bk"><strong class="nm gv">Ingestion: </strong>For every change to a data source that the read-optimized store depends on, we must re-index all affected documents to ensure freshness of data. In this step, the Notification Pipeline consumes notification events from Kafka® and deserializes them into objects that simply contain the document ID and primary source ID.</li><li id="6a8d" class="nk nl gu nm b nn qd np nq nr qe nt nu nv qf nx ny nz qg ob oc od qh of og oh qa qb qc bk"><strong class="nm gv">Join: </strong>Based on these deserialized objects, the Notification Pipeline queries various data stores to fetch all data sources that are necessary for building the materialized view.</li><li id="21a8" class="nk nl gu nm b nn qd np nq nr qe nt nu nv qf nx ny nz qg ob oc od qh of og oh qa qb qc bk"><strong class="nm gv">Stitch: </strong>This step models the join results from various data sources into a comprehensive Java POJO called StitchModel, so that engineers can perform further customized data processing on it.</li><li id="8cac" class="nk nl gu nm b nn qd np nq nr qe nt nu nv qf nx ny nz qg ob oc od qh of og oh qa qb qc bk"><strong class="nm gv">Operate: </strong>In this step, a chain of operators (filter, map, flatMap, etc.) containing product-specific business logic can be applied to the StitchModel to convert it into the final 
document structure that will be stored in the index.</li><li id="78b8" class="nk nl gu nm b nn qd np nq nr qe nt nu nv qf nx ny nz qg ob oc od qh of og oh qa qb qc bk"><strong class="nm gv">Sink: </strong>As the last step, documents can be drained into various data sinks to refresh the materialized views.</li></ul><p id="41f9" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Among these operations, Join, Stitch and Sink are the most important as well as the most complicated ones. In the following sections, we will dive deeper into their design.</p><h1 id="6673" class="oi oj gu bf ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bk">Data Source Join</h1><p id="5da0" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">One of the most crucial and intricate operations in Riverbed’s Notification Pipeline is the Join operation. A Join operation starts from the primary source ID and then fetches data for all data sources associated with the materialized view based on their relationship.</p><h2 id="5ee3" class="qi oj gu bf ok qj qk dy oo ql qm ea os nv qn qo qp nz qq qr qs od qt qu qv qw bk">JoinConditionsDag</h2><p id="0fdd" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">In Riverbed, we use JoinConditionsDag, a Directed Acyclic Graph, to store the relationship metadata among data sources, where each node represents one unique data source and each edge represents the join condition between two data sources. In the Notification Pipelines, JoinConditionsDag’s root node is always a metadata node for the notification event which contains the document ID and the primary source ID. The join condition connecting to the notification event node reflects the join condition to query the primary source. 
Below is a sample JoinConditionsDag defining the join relationship between the primary source Listing and some of its related data sources:</p><figure class="pm pn po pp pq ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx rd"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*NocSoXiS0T3mt6M7 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*NocSoXiS0T3mt6M7 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*NocSoXiS0T3mt6M7 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*NocSoXiS0T3mt6M7 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*NocSoXiS0T3mt6M7 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*NocSoXiS0T3mt6M7 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*NocSoXiS0T3mt6M7 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*NocSoXiS0T3mt6M7 640w, https://miro.medium.com/v2/resize:fit:720/0*NocSoXiS0T3mt6M7 720w, https://miro.medium.com/v2/resize:fit:750/0*NocSoXiS0T3mt6M7 750w, https://miro.medium.com/v2/resize:fit:786/0*NocSoXiS0T3mt6M7 786w, https://miro.medium.com/v2/resize:fit:828/0*NocSoXiS0T3mt6M7 828w, https://miro.medium.com/v2/resize:fit:1100/0*NocSoXiS0T3mt6M7 1100w, https://miro.medium.com/v2/resize:fit:1400/0*NocSoXiS0T3mt6M7 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 
50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qy ff qz mw mx ra rb bf b bg z du"><strong class="bf ok">Picture 3: JoinConditionsDag Sample</strong></figcaption></figure><p id="713c" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Because notification events only indicate which document needs to be refreshed and do not contain any source data, the Notification Pipeline joins data sources starting from the primary source ID provided by the notification event. Guided by the JoinConditionsDag, when the Notification Pipeline processes a notification event containing the primarySourceId, it queries the Listing table to fetch Listing data where the id matches primarySourceId. Subsequently, leveraging this Listing data, it queries the ListingDescription and Room tables to retrieve listing description and room data, respectively, where the listingId equals the id of the Listing. In a similar manner, RoomAmenity data is obtained with roomId matching the id of the Room data.</p><h2 id="2b68" class="qi oj gu bf ok qj qk dy oo ql qm ea os nv qn qo qp nz qq qr qs od qt qu qv qw bk">JoinResultsDag</h2><p id="5a74" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">Now we have the JoinConditionsDag guiding the Notification Pipeline to fetch all data sources. However, the question arises: how can we efficiently store the query results? One straightforward option is to flatten all the joined results into a table-like structure. 
Yet, this approach can consume a significant amount of memory, especially when performing joins with high cardinality. To optimize memory usage, we designed another DAG-like data structure named JoinResultsDag.</p><figure class="pm pn po pp pq ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx re"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*u64DV1qwKOvnEBUi 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*u64DV1qwKOvnEBUi 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*u64DV1qwKOvnEBUi 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*u64DV1qwKOvnEBUi 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*u64DV1qwKOvnEBUi 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*u64DV1qwKOvnEBUi 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*u64DV1qwKOvnEBUi 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*u64DV1qwKOvnEBUi 640w, https://miro.medium.com/v2/resize:fit:720/0*u64DV1qwKOvnEBUi 720w, https://miro.medium.com/v2/resize:fit:750/0*u64DV1qwKOvnEBUi 750w, https://miro.medium.com/v2/resize:fit:786/0*u64DV1qwKOvnEBUi 786w, https://miro.medium.com/v2/resize:fit:828/0*u64DV1qwKOvnEBUi 828w, https://miro.medium.com/v2/resize:fit:1100/0*u64DV1qwKOvnEBUi 1100w, https://miro.medium.com/v2/resize:fit:1400/0*u64DV1qwKOvnEBUi 1400w" sizes="(min-resolution: 4dppx) and (max-width: 
700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qy ff qz mw mx ra rb bf b bg z du"><strong class="bf ok">Picture 4: JoinResultsDag Structure</strong></figcaption></figure><p id="7b69" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">There are two major components in a JoinResultsDag. <strong class="nm gv">Cell</strong> is the atomic container for a data record. Each Cell maintains its own successor relationships by mapping successor data source aliases to CellGroups. <strong class="nm gv">CellGroup</strong> is the container that stores the joined records from one data source. Each record from a data source table is stored in its own Cell.</p><p id="9198" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">As mentioned above, the main advantage of a DAG-based data structure over a traditional flat join table is that it can efficiently store a large amount of join result data, especially when there is a 1:M or M:N join relationship between data sources. For example, we have one pipeline to create materialized views for Airbnb Listings with information about all their Listing rooms, which also have lots of room amenities. 
If we use the traditional flat join table, it will look like the following table.</p><figure class="pm pn po pp pq ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx rf"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*lv66TN0AvjitObxDPWnj9w.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*lv66TN0AvjitObxDPWnj9w.png 640w, https://miro.medium.com/v2/resize:fit:720/1*lv66TN0AvjitObxDPWnj9w.png 720w, https://miro.medium.com/v2/resize:fit:750/1*lv66TN0AvjitObxDPWnj9w.png 750w, https://miro.medium.com/v2/resize:fit:786/1*lv66TN0AvjitObxDPWnj9w.png 786w, https://miro.medium.com/v2/resize:fit:828/1*lv66TN0AvjitObxDPWnj9w.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*lv66TN0AvjitObxDPWnj9w.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*lv66TN0AvjitObxDPWnj9w.png 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="11c2" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Obviously, storing joined results using a flat table structure demands extensive resources for both storage and processing. In contrast, JoinResultsDag effectively mitigates data duplication by allowing multiple successor nodes to refer back to the same ancestor nodes.</p><figure class="pm pn po pp pq ne mw mx paragraph-image"><div role="button" tabindex="0" class="nf ng fj nh bh ni"><div class="mw mx rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*6AZEVqhj2MP0zdMp 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*6AZEVqhj2MP0zdMp 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*6AZEVqhj2MP0zdMp 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*6AZEVqhj2MP0zdMp 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*6AZEVqhj2MP0zdMp 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*6AZEVqhj2MP0zdMp 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*6AZEVqhj2MP0zdMp 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) 
and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*6AZEVqhj2MP0zdMp 640w, https://miro.medium.com/v2/resize:fit:720/0*6AZEVqhj2MP0zdMp 720w, https://miro.medium.com/v2/resize:fit:750/0*6AZEVqhj2MP0zdMp 750w, https://miro.medium.com/v2/resize:fit:786/0*6AZEVqhj2MP0zdMp 786w, https://miro.medium.com/v2/resize:fit:828/0*6AZEVqhj2MP0zdMp 828w, https://miro.medium.com/v2/resize:fit:1100/0*6AZEVqhj2MP0zdMp 1100w, https://miro.medium.com/v2/resize:fit:1400/0*6AZEVqhj2MP0zdMp 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qy ff qz mw mx ra rb bf b bg z du"><strong class="bf ok">Picture 5: JoinResultsDag Example</strong></figcaption></figure><p id="a066" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Now with JoinConditionsDag representing the relationship among all data sources and JoinResultsDag storing all the results, joins can be performed in Riverbed roughly as follows:</p><p id="bd13" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Starting from the NotificationEvent, Riverbed first initializes a JoinResultsDag with the deserialized Notification event as root. 
Then, guided by the JoinConditionsDag and following a depth-first traversal, it visits the data store of each source, queries data based on the join conditions defined on the JoinConditionsDag edges, encapsulates each query result row in a Cell, and then continues fetching the data of its dependencies until it has visited all data sources.</p><h1 id="1235" class="oi oj gu bf ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bk">Stitching of Data</h1><p id="fa03" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">With the joined results now stored in JoinResultsDag, an additional operation is necessary to transform these varied data pieces into a more usable and functional model. This enables engineers to apply their custom operators, mapping the data onto their specifically designed Sink Document. We refer to this process as the Stitch operation, and to its output as the StitchModel.</p><p id="6c10" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">The StitchModel, a Java <a class="af pl" href="https://en.wikipedia.org/wiki/Plain_old_Java_object" rel="noopener ugc nofollow" target="_blank">POJO</a> derived from the custom pipeline definition, serves as the intermediate data model that contains not only the actual data but also useful metadata about the event, such as document ID, version, mutation source, etc.</p><p id="2b44" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">After the StitchModel metadata is generated, with the help of the JoinResultsDag, the Stitch operation is straightforward. 
It maps the JoinResultsDag into a JSON model with the same structure and then converts the JSON model into the custom-defined Java POJO using the Gson library.</p><h1 id="8840" class="oi oj gu bf ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bk">Sink Data</h1><p id="2041" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">The final stage in Riverbed’s Notification Pipeline is to write documents into data sinks. In Riverbed, sinks define where the processed data, now in the form of documents, will be ingested after the preceding operations are completed. Riverbed allows for multiple sinks, including Apache Hive™ and Kafka®, so the same data can be ingested into multiple storage locations if required. This flexibility is a key advantage of the Notification Pipeline, enabling it to cater to a wide variety of use cases.</p><p id="cb04" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk">Riverbed writes documents into data sinks via their write APIs. For the best performance, it packs a collection of documents into a single API request and uses the batched write API of each data sink to update multiple documents efficiently.</p><h1 id="8451" class="oi oj gu bf ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bk">Summary</h1><p id="722c" class="pw-post-body-paragraph nk nl gu nm b nn pg np nq nr ph nt nu nv pi nx ny nz pj ob oc od pk of og oh gn bk">In conclusion, we’ve navigated the critical steps of Riverbed’s streaming system within the Lambda architecture framework, focusing on the construction of materialized views from CDC events. Our focus on the join transformation within the Notification Pipeline showcased a DAG-like structure for efficient and memory-conscious data joining. 
This discussion has shed light on the architectural approach to constructing materialized views in streaming and introduced innovative data structure designs for optimizing streaming data joins. Looking ahead, we will delve deeper into the Source Pipeline of the streaming system and explore the batch system of Riverbed, continuing our journey through advanced data architecture solutions.</p><p id="4c1a" class="pw-post-body-paragraph nk nl gu nm b nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh gn bk"><em class="rh">If this kind of work sounds appealing to you, check out our </em><a class="af pl" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="rh">open roles</em></a><em class="rh"> — we’re hiring!</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/riverbed-data-hydration-part-1-e7011d62d946</link>
      <guid>https://medium.com/airbnb-engineering/riverbed-data-hydration-part-1-e7011d62d946</guid>
      <pubDate>Tue, 10 Sep 2024 18:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Building Postcards for “Airbnb” Scale]]></title>
      <description><![CDATA[<div class="ab cb"><div class="ci bh fz ga gb gc"><div><div></div><p id="cc97" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">By: <a class="af nu" rel="noopener" href="https://medium.com/@Sundiata">Leo Wong</a>, <a class="af nu" rel="noopener" href="https://medium.com/@henry.johnson_26073">Henry Johnson</a></p><p id="93cb" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">How the Airbnb Media team built group travel Postcards for the 2024 Summer Release by leveraging a novel destination matching algorithm while advancing the platform’s image &amp; localized text processing capabilities.</p></div></div><div class="nv"><div class="ab cb"><div class="ly nw lz nx ma ny cf nz cg oa ci bh"><figure class="oe of og oh oi nv oj ok paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc od"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:2000/format:webp/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 
700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 1100w, https://miro.medium.com/v2/resize:fit:2000/1*2J5D7hHVULfMa9n3cTI3wA.jpeg 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">Airbnb Postcards (see <a class="af nu" href="https://www.airbnb.com/release/" rel="noopener ugc nofollow" target="_blank">announcement</a>).</figcaption></figure></div></div></div><div class="ab cb"><div class="ci bh fz ga gb gc"><h1 id="39fa" class="ou ov gu bf ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr bk">Introduction</h1><p id="02c2" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">For Airbnb’s 2024 Summer Release, the Media Ingestion team at Airbnb took on the exciting challenge of creating a reliable postcard generation system to generate unique, hand-crafted Postcards. 
Postcards are a beautiful way to invite guests on a trip while keeping friends and family in the loop (see <a class="af nu" href="https://www.airbnb.com/release/" rel="noopener ugc nofollow" target="_blank">announcement</a>). This feature required a novel solution to match relevant postcards to every possible destination that guests booked on Airbnb. It needed to render performantly not only on all our client platforms (iOS, Android, and Web), but also on different messaging platforms outside the Airbnb app, all while maintaining Airbnb’s high design standards.</p><h1 id="104f" class="ou ov gu bf ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr bk">Challenges</h1><ul class=""><li id="f310" class="mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px py pz bk"><strong class="my gv">Localized Text Layout: </strong>Postcards have strict design guidelines around character count per line, font leading &amp; kerning per language, pixel-perfect typography, line break rules, and language-specific styling.</li><li id="0d0c" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt px py pz bk"><strong class="my gv">Design &amp; Product Flexibility: </strong>Text layout, color, fonts, text drop shadows, and image transformations need to be flexible for product &amp; design changes.</li><li id="941b" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt px py pz bk"><strong class="my gv">Destination Matching: </strong>Postcards need to match the destination by including relevant artwork and localized destination names.</li><li id="cfb9" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt px py pz bk"><strong class="my gv">Availability On and Off Platform: </strong>Assets need to be surfaced on and off the platform, which necessitated a pre-generated server-side solution; client-only solutions wouldn’t work since we needed Open Graph compatible links 
for assets to render properly in iMessage and Instagram, for example.</li><li id="14aa" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt px py pz bk"><strong class="my gv">Performance:</strong> Postcard presentation shouldn’t interrupt the product UX by taking significant time to render.</li></ul><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc qf"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*KPS7rdPqtbiowRSRgRpSXw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*KPS7rdPqtbiowRSRgRpSXw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*KPS7rdPqtbiowRSRgRpSXw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*KPS7rdPqtbiowRSRgRpSXw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*KPS7rdPqtbiowRSRgRpSXw.png 786w, 
https://miro.medium.com/v2/resize:fit:828/1*KPS7rdPqtbiowRSRgRpSXw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*KPS7rdPqtbiowRSRgRpSXw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*KPS7rdPqtbiowRSRgRpSXw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du"><strong class="bf ow">Postcards “in vs. out” of product experience</strong></figcaption></figure><h1 id="c0b2" class="ou ov gu bf ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr bk">Solution</h1><h2 id="6ca0" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk">Postcard Setup</h2><p id="cac7" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">A Trips Postcard is the combination of (1) an artwork illustration, (2) a postcard template, and (3) a localized destination. A Postcard “template” is a data entity with a reference to the artwork illustration plus some additional metadata to describe how to render it. To render a Postcard in product, we need to have all artwork illustrations, postcard templates, and formatted localized destinations set up before a visitor sees the product flow.</p><p id="47ab" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">In the example below, we have an illustration of a cliff side, a postcard template, and an English destination name of “Galway”. 
The Postcard template includes parameters to specify how to create the postcard, like text and Belo (Airbnb brand icon) color and positioning. In the example, the text and Belo are rendered in gray color and positioned at the bottom and top left, respectively.</p><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc qv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*mTltSCyOmosoAVx4 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*mTltSCyOmosoAVx4 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*mTltSCyOmosoAVx4 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*mTltSCyOmosoAVx4 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*mTltSCyOmosoAVx4 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*mTltSCyOmosoAVx4 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*mTltSCyOmosoAVx4 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*mTltSCyOmosoAVx4 640w, https://miro.medium.com/v2/resize:fit:720/0*mTltSCyOmosoAVx4 720w, https://miro.medium.com/v2/resize:fit:750/0*mTltSCyOmosoAVx4 750w, https://miro.medium.com/v2/resize:fit:786/0*mTltSCyOmosoAVx4 786w, https://miro.medium.com/v2/resize:fit:828/0*mTltSCyOmosoAVx4 828w, https://miro.medium.com/v2/resize:fit:1100/0*mTltSCyOmosoAVx4 1100w, https://miro.medium.com/v2/resize:fit:1400/0*mTltSCyOmosoAVx4 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du"><strong class="bf ow"><em class="qw">Illustration + Template Params + Localized Destination = Postcard</em></strong></figcaption></figure><h2 id="04ad" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Design Flexibility — Postcard Templates</strong></h2><p id="8211" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">To accommodate changing design requirements, we built a flexible template data model that empowers our design team to configure various parameters like text positioning and text color while iterating on the designs. The postcard templates include all the metadata required to generate a Postcard and its surrounding presentation elements. 
It also includes a versioning capability so that we can publish changes to all users with a version bump whenever we have a design revision or visual defect.</p><h2 id="97d5" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Template &amp; Artwork Upload, Management &amp; Preview</strong></h2><p id="84cc" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">To make it easy for the creative team to self-serve and debug issues, we built a web-based internal tool for creating and managing templates, previewing postcards, and uploading artwork. This tool made it significantly easier to manage templates, and was especially useful during the team’s peak iteration period where we were constantly fixing bugs and changing designs.</p><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc qx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*RgCM_RX07hZGpBGW 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*RgCM_RX07hZGpBGW 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*RgCM_RX07hZGpBGW 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*RgCM_RX07hZGpBGW 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*RgCM_RX07hZGpBGW 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*RgCM_RX07hZGpBGW 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*RgCM_RX07hZGpBGW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*RgCM_RX07hZGpBGW 640w, https://miro.medium.com/v2/resize:fit:720/0*RgCM_RX07hZGpBGW 720w, https://miro.medium.com/v2/resize:fit:750/0*RgCM_RX07hZGpBGW 750w, https://miro.medium.com/v2/resize:fit:786/0*RgCM_RX07hZGpBGW 786w, https://miro.medium.com/v2/resize:fit:828/0*RgCM_RX07hZGpBGW 828w, https://miro.medium.com/v2/resize:fit:1100/0*RgCM_RX07hZGpBGW 1100w, https://miro.medium.com/v2/resize:fit:1400/0*RgCM_RX07hZGpBGW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">The postcard template creation &amp; management form</figcaption></figure></div></div><div class="nv bh"><figure class="qy qz ra rb rc nv bh paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 1100w, https://miro.medium.com/v2/resize:fit:4548/format:webp/1*v-CoGt4P0A_ABvjk2T-tQg.png 4548w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 2274px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*v-CoGt4P0A_ABvjk2T-tQg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*v-CoGt4P0A_ABvjk2T-tQg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*v-CoGt4P0A_ABvjk2T-tQg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*v-CoGt4P0A_ABvjk2T-tQg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*v-CoGt4P0A_ABvjk2T-tQg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*v-CoGt4P0A_ABvjk2T-tQg.png 1100w, https://miro.medium.com/v2/resize:fit:4548/1*v-CoGt4P0A_ABvjk2T-tQg.png 4548w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 2274px" /></picture><figcaption class="oq ff or ob oc os ot bf b bg z du">In the setup workflow, our operations team created Postcard templates, uploaded artwork, and managed them via the template form page.</figcaption></figure></div><div class="ab cb"><div class="ci bh fz ga gb gc"><h2 id="6696" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Localized Text 
Layout</strong></h2><p id="cf1a" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">We wanted accurate translations of destinations that were properly formatted for each localized postcard. A programmatic solution for localized text layout would require, at the very least, language-specific rules (right-to-left scripts, word wrapping, etc.), knowledge of cultural conventions, accessibility considerations, and text rendering for special characters (diacritics, etc.). This would make the business logic complex and brittle.</p><p id="5e54" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Instead, we agreed on a compromise with the localization (l10n) team: manually formatting translations for our top booking destinations. Our l10n scaled operations team translated and formatted (line breaks, layout spacing, etc.) a shortlist of localized destinations, which we helped ingest into our usual i18n platform translated-text workflow with some scripting. After ingestion, the Postcard generation system pulls from our i18n platform to get the desired localized, formatted text layout for each Postcard.</p><p id="db84" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">The destination shortlist was informed by our data science team, who helped gather the top booked destinations by language. This reduced the number of postcards we had to generate from the full set of language-locale and destination combinations on the platform to a subset an order of magnitude smaller. As a result, postcard QA was significantly easier. 
More importantly, it kept the overall system code and maintenance simple; no need for thousands of lines of language specific business logic!</p><p id="451c" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">The rest of the postcards that didn’t make the top destinations list followed a simple formula of word count and line breaks per language (e.g. Chinese, Korean, and Japanese had smaller word count limits because of the character size and no line break on spaces because it changes the meaning of the destination).</p><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc qv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*V6OQN2tiaDycTdsT 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*V6OQN2tiaDycTdsT 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*V6OQN2tiaDycTdsT 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*V6OQN2tiaDycTdsT 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*V6OQN2tiaDycTdsT 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*V6OQN2tiaDycTdsT 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*V6OQN2tiaDycTdsT 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*V6OQN2tiaDycTdsT 640w, https://miro.medium.com/v2/resize:fit:720/0*V6OQN2tiaDycTdsT 720w, 
https://miro.medium.com/v2/resize:fit:750/0*V6OQN2tiaDycTdsT 750w, https://miro.medium.com/v2/resize:fit:786/0*V6OQN2tiaDycTdsT 786w, https://miro.medium.com/v2/resize:fit:828/0*V6OQN2tiaDycTdsT 828w, https://miro.medium.com/v2/resize:fit:1100/0*V6OQN2tiaDycTdsT 1100w, https://miro.medium.com/v2/resize:fit:1400/0*V6OQN2tiaDycTdsT 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">Early design prototype of text layout in a subset of the languages we support at Airbnb. The screenshot is for illustration purposes only.</figcaption></figure><h1 id="25de" class="ou ov gu bf ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr bk">Postcard Generation</h1><h2 id="c635" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Destination Matching</strong></h2><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc rd"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*cXhikVfO3DtYE0AW 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*cXhikVfO3DtYE0AW 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*cXhikVfO3DtYE0AW 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*cXhikVfO3DtYE0AW 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*cXhikVfO3DtYE0AW 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*cXhikVfO3DtYE0AW 1100w, 
https://miro.medium.com/v2/resize:fit:1400/format:webp/0*cXhikVfO3DtYE0AW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*cXhikVfO3DtYE0AW 640w, https://miro.medium.com/v2/resize:fit:720/0*cXhikVfO3DtYE0AW 720w, https://miro.medium.com/v2/resize:fit:750/0*cXhikVfO3DtYE0AW 750w, https://miro.medium.com/v2/resize:fit:786/0*cXhikVfO3DtYE0AW 786w, https://miro.medium.com/v2/resize:fit:828/0*cXhikVfO3DtYE0AW 828w, https://miro.medium.com/v2/resize:fit:1100/0*cXhikVfO3DtYE0AW 1100w, https://miro.medium.com/v2/resize:fit:1400/0*cXhikVfO3DtYE0AW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">Decision tree for determining the best postcard template for a given reservation</figcaption></figure><p id="0b76" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">To find the best-matching postcard template for each destination, we have a matching algorithm that 
pairs templates with reservations at booking time using four different criteria:</p><ol class=""><li id="7aba" class="mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt re py pz bk">By <strong class="my gv">listing</strong> — We wanted to be able to support listing-specific artwork. For example, our <a class="af nu" href="https://airbnb.com/icons" rel="noopener ugc nofollow" target="_blank">Icons listings</a> show a golden ticket to commemorate the special moment when a guest wins the lottery for staying at an Icons listing.</li><li id="7313" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt re py pz bk">By <strong class="my gv">destination</strong> — For popular destinations (matching by city and country), we have curated artwork that showcases both a local artist and the destination. For example, trips to Santorini present the iconic Cycladic domes of Santorini as artwork (see diagram below).</li><li id="6b6e" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt re py pz bk">By <strong class="my gv">taxonomy</strong> — For all other artwork, we match destinations based on a set of taxonomy tags. We partnered with the knowledge graph team to apply taxonomy attributes to all of our listings in a few different categories: density (e.g. metropolitan, urban), climate (e.g. tropical, temperate), and geography (e.g. coastal, mountain, river). We ensured the taxonomy was accurate by cross-referencing existing internal data and the expertise of our regional representative teams. The knowledge graph team then exposed an API that we call to fetch the taxonomy for a listing. On the operations side, our production team created taxonomy-tagged artwork (e.g. an artwork tagged to be used for a coastal, temperate, metropolitan postcard). 
When generating postcards, we match the listing to the artwork with the highest number of overlapping tags.</li><li id="fbea" class="mw mx gu my b mz qa nb nc nd qb nf ng nh qc nj nk nl qd nn no np qe nr ns nt re py pz bk">By <strong class="my gv">default</strong> — If a destination isn’t covered by the above categories, we show a fallback default artwork.</li></ol></div></div><div class="nv"><div class="ab cb"><div class="ly nw lz nx ma ny cf nz cg oa ci bh"><figure class="oe of og oh oi nv oj ok paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc rf"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 1100w, https://miro.medium.com/v2/resize:fit:2000/format:webp/1*H7CQTLDPx16FpmKpsR8Q5A.png 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*H7CQTLDPx16FpmKpsR8Q5A.png 640w, https://miro.medium.com/v2/resize:fit:720/1*H7CQTLDPx16FpmKpsR8Q5A.png 720w, 
https://miro.medium.com/v2/resize:fit:750/1*H7CQTLDPx16FpmKpsR8Q5A.png 750w, https://miro.medium.com/v2/resize:fit:786/1*H7CQTLDPx16FpmKpsR8Q5A.png 786w, https://miro.medium.com/v2/resize:fit:828/1*H7CQTLDPx16FpmKpsR8Q5A.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*H7CQTLDPx16FpmKpsR8Q5A.png 1100w, https://miro.medium.com/v2/resize:fit:2000/1*H7CQTLDPx16FpmKpsR8Q5A.png 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">The 4 initial Postcard variants: Icons, Destination specific, Taxonomy, and Default.</figcaption></figure></div></div></div><div class="ab cb"><div class="ci bh fz ga gb gc"><h2 id="654a" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Formatted Translations</strong></h2><p id="aab3" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">We take the listing of each booking request and fetch the city and country from the listing service and check to see if that destination was in our curated set of formatted destinations loaded into our i18n service. We then take the best fitting artwork and embed the localized destination text on it to generate the final postcard. 
If we don’t get a translation back, we fall back to serving the postcard without text.</p><h2 id="3a9c" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Performance — Async Postcard Creation Flow</strong></h2><p id="50ec" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">Putting a localized destination and a Belo icon onto artwork is a time-consuming operation given the high-resolution artwork we used. We knew the image processing flow could take over 8 seconds on average to process an image, so we needed a way to make our postcard API respond quickly. We also wanted to transfer these generated postcards into our primary image storage so we could leverage our existing media serving infrastructure, which introduced an additional 1–2 seconds of latency.</p><p id="8ce6" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">To stay performant, we went with a partly asynchronous approach where, during the live in-product request, we only serve postcards that we’ve already generated and stored internally. When a request comes in for a postcard that doesn’t exist yet, we instead return a fallback postcard and publish an event to a Kafka queue, where an async consumer calls the processing service, waits for the asset to be generated, and then transfers it into our system to be used for future requests.</p><p id="3571" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">As shown in the diagram below, we fetch the listing information and taxonomy information in parallel before computing the best matching artwork for the trip. 
Based on a pattern in how the postcards are stored, we check our media service to see whether the postcard has already been generated. If it has, we return it; if not, we kick off the asynchronous flow. At that point, our media service’s Kafka consumer completes the flow by transforming the asset into a postcard and storing it in our system.</p><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*IFqd3spjdRk57Lq8 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*IFqd3spjdRk57Lq8 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*IFqd3spjdRk57Lq8 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*IFqd3spjdRk57Lq8 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*IFqd3spjdRk57Lq8 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*IFqd3spjdRk57Lq8 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*IFqd3spjdRk57Lq8 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*IFqd3spjdRk57Lq8 640w, https://miro.medium.com/v2/resize:fit:720/0*IFqd3spjdRk57Lq8 720w, https://miro.medium.com/v2/resize:fit:750/0*IFqd3spjdRk57Lq8 750w, https://miro.medium.com/v2/resize:fit:786/0*IFqd3spjdRk57Lq8 786w, https://miro.medium.com/v2/resize:fit:828/0*IFqd3spjdRk57Lq8 828w, 
https://miro.medium.com/v2/resize:fit:1100/0*IFqd3spjdRk57Lq8 1100w, https://miro.medium.com/v2/resize:fit:1400/0*IFqd3spjdRk57Lq8 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">Architecture Diagram of the Backend Postcard Generation flow</figcaption></figure><h2 id="3da7" class="qg ov gu bf ow qh qi dy pa qj qk ea pe nh ql qm qn nl qo qp qq np qr qs qt qu bk"><strong class="al">Pre-generation</strong></h2><p id="8afe" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">We wanted to generate as many of the postcards as possible before the launch. If the postcard hasn’t been generated when a guest books a group trip, everyone on the booking will see the default, generic postcard. Our data science team helped determine top destinations, and we ran those inputs through our postcard generation pipeline to pre-generate as many postcards as possible and minimize the chance of falling back to a default postcard. 
Within a week of launching, more than 90% of trips had a custom postcard instead of a default and we inched closer to generating a postcard for all trips in the months after.</p><figure class="oe of og oh oi nv ob oc paragraph-image"><div role="button" tabindex="0" class="ol om fj on bh oo"><div class="ob oc rh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*zYbErXfmHNlnCf0M 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*zYbErXfmHNlnCf0M 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*zYbErXfmHNlnCf0M 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*zYbErXfmHNlnCf0M 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*zYbErXfmHNlnCf0M 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*zYbErXfmHNlnCf0M 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*zYbErXfmHNlnCf0M 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*zYbErXfmHNlnCf0M 640w, https://miro.medium.com/v2/resize:fit:720/0*zYbErXfmHNlnCf0M 720w, https://miro.medium.com/v2/resize:fit:750/0*zYbErXfmHNlnCf0M 750w, https://miro.medium.com/v2/resize:fit:786/0*zYbErXfmHNlnCf0M 786w, https://miro.medium.com/v2/resize:fit:828/0*zYbErXfmHNlnCf0M 828w, https://miro.medium.com/v2/resize:fit:1100/0*zYbErXfmHNlnCf0M 1100w, https://miro.medium.com/v2/resize:fit:1400/0*zYbErXfmHNlnCf0M 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="oq ff or ob oc os ot bf b bg z du">Dashboard numbers on Postcard template selection type and postcard usage hit rate. <em class="qw">For illustrative purposes only, not real data.</em></figcaption></figure><h1 id="8d91" class="ou ov gu bf ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr bk">Conclusion</h1><p id="69f5" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">Creating postcards was a massive effort that required collaboration across multiple engineering, product, design, and data science teams to improve Airbnb’s group travel feature. Our frontline insights team continues to receive positive social media and external feedback on this update that adds delight to joining a group trip.</p><p id="2038" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">The solution highlights the importance of having the right internal tooling, image and text processing capabilities, and destination matching logic for solving something at Airbnb’s scale.</p><p id="1aaf" class="pw-post-body-paragraph mw mx gu my b mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt gn bk">Postcards is one of the first major image processing use cases that the Media team built to support a new Airbnb feature. It highlights the power of media capabilities and innovative features we can build with them. 
If you like the type of work we do at Airbnb, please contact us &amp; check out our <a class="af nu" href="https://careers.airbnb.com" rel="noopener ugc nofollow" target="_blank">careers page</a>!</p><h1 id="5af1" class="ou ov gu bf ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr bk">Acknowledgments</h1><p id="624d" class="pw-post-body-paragraph mw mx gu my b mz ps nb nc nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt gn bk">Thanks to the following engineers who helped to build this feature: Alan Wright, Aditya Punjani, Bill Lovotti, Jessica Chen, Miguel Jimenez</p></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/building-postcards-for-airbnb-scale-dfe0b71b12ec</link>
      <guid>https://medium.com/airbnb-engineering/building-postcards-for-airbnb-scale-dfe0b71b12ec</guid>
      <pubDate>Wed, 28 Aug 2024 18:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Personal Data Classification]]></title>
      <description><![CDATA[<div><div></div><p id="99fa" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">An Important Foundation For Security, Privacy, and Compliance at Airbnb</p><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fk og bg oh"><div class="nv nw nx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*UWzBCWsCQWkrzLjV 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*UWzBCWsCQWkrzLjV 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*UWzBCWsCQWkrzLjV 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*UWzBCWsCQWkrzLjV 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*UWzBCWsCQWkrzLjV 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*UWzBCWsCQWkrzLjV 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*UWzBCWsCQWkrzLjV 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*UWzBCWsCQWkrzLjV 640w, https://miro.medium.com/v2/resize:fit:720/0*UWzBCWsCQWkrzLjV 720w, https://miro.medium.com/v2/resize:fit:750/0*UWzBCWsCQWkrzLjV 750w, https://miro.medium.com/v2/resize:fit:786/0*UWzBCWsCQWkrzLjV 786w, https://miro.medium.com/v2/resize:fit:828/0*UWzBCWsCQWkrzLjV 828w, https://miro.medium.com/v2/resize:fit:1100/0*UWzBCWsCQWkrzLjV 1100w, https://miro.medium.com/v2/resize:fit:1400/0*UWzBCWsCQWkrzLjV 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="ea0a" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><strong class="mz gw">By:</strong> <a class="af oj" href="https://www.linkedin.com/in/samhykim/" rel="noopener ugc nofollow" target="_blank">Sam Kim</a>, <a class="af oj" href="https://www.linkedin.com/in/alexander-klimov-8521599/" rel="noopener ugc nofollow" target="_blank">Alex Klimov</a>, <a class="af oj" href="https://www.linkedin.com/in/woodyzhou/" rel="noopener ugc nofollow" target="_blank">Woody Zhou</a>, <a class="af oj" href="https://www.linkedin.com/in/sylviatomiyama/" rel="noopener ugc nofollow" target="_blank">Sylvia Tomiyama</a>, <a class="af oj" href="https://www.linkedin.com/in/aniket-arondekar/" rel="noopener ugc nofollow" target="_blank">Aniket Arondekar</a>, <a class="af oj" href="https://www.linkedin.com/in/ansumanacharya/" rel="noopener ugc nofollow" target="_blank">Ansuman Acharya</a></p><h1 id="1ad9" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Introduction</h1><p id="378b" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj"><strong class="mz gw">Airbnb is built on trust</strong>. One key way we maintain trust with our community is by ensuring that personal data is handled with care, in a manner that meets security, privacy, and compliance requirements. 
Understanding where and what personal data exists is foundational to this.</p><p id="8f4d" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">Over the past several years, we’ve built our own data classification system that adapts to the needs of our data ecosystem, streamlines our processes, and further unlocks our ability to protect the data entrusted to Airbnb. This was made possible by many teams working closely toward a shared objective. Information Security, Privacy, Data Governance, Legal, and Engineering collaborated to tackle this problem holistically and produce a unified data identification and classification strategy across all data stores.</p><p id="1859" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">In this blog post, we will shed light on the complexities of how data classification works at Airbnb, the measurements we set to assess the quality, performance, and accuracy of the systems involved, and the important considerations when building a data classification system. We hope to share insights with others who are facing similar challenges and to provide a framework for how data classification systems can be built at scale.</p><h1 id="ffca" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">The Complexities of Data Classification at Airbnb</h1><p id="a6e6" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">Data classification is the process of identifying where data exists and then organizing, detecting, and annotating that data based on a taxonomy. At Airbnb, we have established a Personal Data Taxonomy Council to define the taxonomy for personal data and to refine it over time. 
This taxonomy breaks down personal data into various data elements that are relevant for our ecosystem such as email address, physical address, and guest names. Once data is annotated with its applicable personal data element(s), various enforcement systems use these annotations to ensure personal data is handled according to our Security and Privacy policies. In this blog post, we will focus primarily on the data classification workflow and not each type of enforcement use case.</p><p id="6717" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">The workflow can be classified into three pillars:</p><ul class=""><li id="fcb3" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj"><strong class="mz gw">Catalog</strong>: What data do we have?</li><li id="ed9d" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Detection</strong>: What data do we suspect is personal data?</li><li id="d5fb" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Reconciliation</strong>: Which classification do we choose?</li></ul><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fk og bg oh"><div class="nv nw nx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*AEfqygsxW532DeEe 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*AEfqygsxW532DeEe 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*AEfqygsxW532DeEe 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*AEfqygsxW532DeEe 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*AEfqygsxW532DeEe 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*AEfqygsxW532DeEe 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*AEfqygsxW532DeEe 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*AEfqygsxW532DeEe 640w, https://miro.medium.com/v2/resize:fit:720/0*AEfqygsxW532DeEe 720w, https://miro.medium.com/v2/resize:fit:750/0*AEfqygsxW532DeEe 750w, https://miro.medium.com/v2/resize:fit:786/0*AEfqygsxW532DeEe 786w, https://miro.medium.com/v2/resize:fit:828/0*AEfqygsxW532DeEe 828w, https://miro.medium.com/v2/resize:fit:1100/0*AEfqygsxW532DeEe 1100w, https://miro.medium.com/v2/resize:fit:1400/0*AEfqygsxW532DeEe 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv fe pw nv nw px py be b bf z dt">Personal Data Classification Flow</figcaption></figure><p id="6719" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">Let’s dig deeper into how these three pillars form the backbone of data classification.</p><h1 id="6a96" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Catalog</h1><p id="dca6" 
class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">Cataloging involves building a dynamic, accurate, and scalable system to first <em class="pz">identify</em> where data exists and then <em class="pz">organize</em> the whole inventory. Cataloging is akin to mapping the data landscape or organizing a library. It involves dynamically discovering new data, enriching it with metadata from various sources, and manually inputting information. This process is crucial for enforcing data policies, accurately classifying data, and assigning it to the correct owners.</p><ul class=""><li id="a5ba" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj"><strong class="mz gw">Automated and Dynamic Discovery:</strong> Automation makes the cataloging process scalable and efficient. For the variety of data stores that Airbnb uses, such as production and analytical databases, object stores, and cloud storage, our catalogs connect to them and dynamically fetch the full inventory of data. Either through stream or batch processing, they dynamically update to reflect new and changed data. This ensures the catalog is a reliable and accurate source of truth.</li><li id="0e14" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Complexity and Diversity in Data Sources:</strong> The challenge of cataloging stems from the variety and complexity of data sources, including different formats and locations. 
Our cataloging systems fetch metadata in several ways: through direct API calls or by crawling schemas in formats like thrift, JSON, yaml, or config files, accommodating the diverse nature of modern data storage.</li></ul><p id="22ad" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">For search and discovery, many of our data entities are surfaced in the data management platform, <a class="af oj" rel="noopener" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19">Metis</a>. This helps the data owners quickly answer questions such as which data contains personal data, who owns the data, and which controls are in place.</p><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fk og bg oh"><div class="nv nw nx"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*_Xnn1fMSfTgvUmMn 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*_Xnn1fMSfTgvUmMn 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*_Xnn1fMSfTgvUmMn 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*_Xnn1fMSfTgvUmMn 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*_Xnn1fMSfTgvUmMn 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*_Xnn1fMSfTgvUmMn 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*_Xnn1fMSfTgvUmMn 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source 
data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*_Xnn1fMSfTgvUmMn 640w, https://miro.medium.com/v2/resize:fit:720/0*_Xnn1fMSfTgvUmMn 720w, https://miro.medium.com/v2/resize:fit:750/0*_Xnn1fMSfTgvUmMn 750w, https://miro.medium.com/v2/resize:fit:786/0*_Xnn1fMSfTgvUmMn 786w, https://miro.medium.com/v2/resize:fit:828/0*_Xnn1fMSfTgvUmMn 828w, https://miro.medium.com/v2/resize:fit:1100/0*_Xnn1fMSfTgvUmMn 1100w, https://miro.medium.com/v2/resize:fit:1400/0*_Xnn1fMSfTgvUmMn 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv fe pw nv nw px py be b bf z dt">Catalog UI</figcaption></figure><h1 id="6069" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Detection</h1><p id="b901" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">For personal data detection, we use the in-house automated detection service in our <a class="af oj" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-1-c74909328e08">Data Protection Platform</a> which was built to protect data in compliance with global regulations and security requirements. 
As our own taxonomy has grown, we have expanded our capabilities and made the service easily extensible to detect all other types of personal data elements and personal Airbnb IDs.</p><h2 id="fc00" class="qa ol gv be om qb qc dx oq qd qe dz ou ni qf qg qh nm qi qj qk nq ql qm qn qo bj">Detection engine</h2><p id="7d59" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">For each data entity stored in the catalogs, a scanning job is scheduled through a message queue; the job samples data and runs it through our list of classifiers. Because classifiers need periodic updates, we designed the detection engine for simplicity and flexibility. Since its inception, the detection engine has been upgraded to include additional steps and has adopted configuration-driven development. Most of its logic has been rewritten as simpler configurations to speed up iteration on existing classifiers, improve testing, and enable quick development of new features.</p><p id="0998" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">The detection engine can be seen as a pipeline with three stages: the scanner, the validator, and thresholding.</p><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fk og bg oh"><div class="nv nw qp"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 828w, 
https://miro.medium.com/v2/resize:fit:1100/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*a5wuHdZPYea_rIK1PuXnfA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*a5wuHdZPYea_rIK1PuXnfA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*a5wuHdZPYea_rIK1PuXnfA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*a5wuHdZPYea_rIK1PuXnfA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*a5wuHdZPYea_rIK1PuXnfA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*a5wuHdZPYea_rIK1PuXnfA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*a5wuHdZPYea_rIK1PuXnfA.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*a5wuHdZPYea_rIK1PuXnfA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv fe pw nv nw px py be b bf z dt">Detection Engine</figcaption></figure><p id="36e1" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu 
go bj"><strong class="mz gw">Scanner:</strong> The scanner classifies personal data using metadata and content. Its methods range from regexes for emails and keyword lists for cities to machine learning models for complex data types that require contextual understanding.</p><p id="e2ca" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><strong class="mz gw">Validator:</strong> Sampled data that matches a scanner undergoes a customizable validation step to improve classifier accuracy, for example by verifying latitude/longitude ranges or custom ciphertexts from encryption services.</p><p id="a67d" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><strong class="mz gw">Thresholding:</strong> To reduce noise, thresholds are applied before results are stored. They vary by data structure type (e.g., matched rows vs. findings in a document) and are set based on historical data frequency and criticality.</p><p id="5ee0" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">The revamped pipeline has significantly decreased false positive findings and reduced the burden on data owners to verify every result, which has historically impeded their productivity.</p><h1 id="1dc7" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Reconciliation</h1><p id="f86e" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">Not every detection surfaced is correct, and some require more context. Therefore, we employ a <strong class="mz gw">human-in-the-loop</strong> <strong class="mz gw">strategy</strong> in which data owners confirm the classifications. 
This step is critical in ensuring these classifications are correct before any data policies are automatically enforced to protect our data.</p><h2 id="534e" class="qa ol gv be om qb qc dx oq qd qe dz ou ni qf qg qh nm qi qj qk nq ql qm qn qo bj">Automated Notifications</h2><p id="d3cb" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">For compliance, we have an automated notification system that issues tickets whenever personal data is detected. These tickets are surfaced to the appropriate data or service owners with strict SLAs.</p><p id="ca1c" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">For data entities that have schemas defined in code, such as transactional tables from production services (online), Amazon S3 buckets, or tables that are exported to our data warehouse, we assist developers by automatically creating code changes that update their table schemas with the detected personal data elements.</p><h2 id="d570" class="qa ol gv be om qb qc dx oq qd qe dz ou ni qf qg qh nm qi qj qk nq ql qm qn qo bj">Enforcing resolution</h2><p id="3f44" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">To enforce resolution of these tickets, tables in the data warehouse are automatically access-controlled when their tickets are not resolved within the SLA. Additionally, reviews are conducted to ensure our classifications are correct for data that is subject to handling requirements.</p><p id="637b" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">Tracking the actions taken on these tickets has been important for assessing the quality of our data classification flow and for keeping an audit trail of past detections. It also highlights points of friction developers face when resolving these tickets. 
Our investments in this area have continued to improve the process and to reduce, year over year, the time needed to resolve tickets.</p><h1 id="ef69" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Assessing Quality of a Data Classification System</h1><p id="1ef5" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">The complexity of the system and its sub-components made it a unique challenge to define what quality means for the entire system. To build with the long term in mind, we evaluated how well our entire data classification system functions as a whole.</p><p id="d884" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">We’ve set up measurements to assess the quality of our data classification in three categories:</p><p id="5e3d" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><strong class="mz gw">Recall</strong>: This measures our coverage and our ability to <strong class="mz gw">not miss</strong> where personal data may exist, which is crucial for protecting the stored personal data. 
We assess recall through:</p><ul class=""><li id="6ac4" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj">Number of data entities integrated in the data classification system</li><li id="e98d" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Volume of personal data that exists from all different sources</li><li id="7bc1" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Types of personal data being annotated and automatically detected against our taxonomy</li></ul><p id="f459" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><strong class="mz gw">Precision</strong>: This evaluates the accuracy of our data classifications, vital for data owners tagging their data. High precision minimizes tagging friction. Precision is measured by:</p><ul class=""><li id="d153" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj">Tracking false positive rates of classifiers for each type of personal data</li><li id="ca36" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Tracking ticket resolutions made by data owners, which also aids in understanding nuanced classification cases</li></ul><p id="90e6" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><strong class="mz gw">Speed</strong>: This gauges the efficiency of identifying and classifying personal data, aiming to minimize compliance risks. 
Speed is measured by:</p><ul class=""><li id="18a5" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj">Time it takes for the detection engine to scan new data entities</li><li id="c4a7" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Time it takes for data owners to reconcile classifications and resolve tickets</li><li id="0178" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">The frequency of data tagging at creation by data owners</li></ul><p id="dfe1" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">These measurements ensure our data classification system is effective, accurate, and efficient, safeguarding our personal data.</p><h1 id="25f1" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Considerations for Building a Data Classification System</h1><p id="fc42" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">It is important to be aware of the limitations of the outlined approach. Below are some challenges that we’ve considered when building a data classification system:</p><ul class=""><li id="b672" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj"><strong class="mz gw">Post-Processing Classification</strong>: The outlined approach mostly relies on post-processing classification, which means that schema information is added after data has been collected and stored. 
In a modern data world where data and metadata are constantly changing, post-processing cannot keep up with data evolution.</li><li id="9836" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Inconsistent Classifications:</strong> Data generally flows from online to offline through ETL (extract, transform, and load) processes, and then flows back to the online world through reverse ETL. However, data classification that is performed independently in both worlds can lead to inconsistent classifications.</li><li id="f194" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Waste of Process Cost</strong>: Duplicate annotations can be made for the same data in the online and offline domains, which might result in increased costs for data classification processes.</li></ul><p id="476c" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">To address these challenges, we describe the process of “<strong class="mz gw">shifting left</strong>” with data classification and how we started to push developers to annotate their data at the beginning of the data lifecycle.</p><h1 id="0227" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Shifting Left</h1><p id="9ce5" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">Instead of thinking about governance and data classification as an activity that happens post-hoc, we’ve started to embed the annotation process directly into data schemas as they are being created and updated. 
This enables us to:</p><ul class=""><li id="7e21" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj"><strong class="mz gw">Shift Classification from Data to Schema</strong>: The schema annotation process takes place earlier in the data lifecycle at the point of data collection. This keeps annotations updated as data evolves and ensures data is annotated before collection and consumption, allowing for immediate policy enforcement.</li><li id="927a" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Shift Classification from Offline to Online</strong>: Traditionally done offline, data classification is now integrated into production services, ensuring data is structured and formatted correctly from the start. Leveraging data lineage information enables automated annotation, reducing the need for manual effort and lowering process costs.</li><li id="e6fd" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Shift from Data Steward to Data Owner</strong>: Oftentimes, stewardship, or the responsible management and oversight, of the data is conducted by people who are downstream of data creation, such as data consumers or governance professionals. This change shifts stewardship to the data producers, merging the roles of data steward and data owner. 
This empowers the team that owns the data to manage it more effectively and scale operations.</li></ul><h1 id="f14c" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Schema Annotation Enforcement</h1><p id="9d2c" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">Focusing on our most crucial online data, we have started executing on shifting left by directly integrating with our internal schema definition language, which is known for its annotation capabilities. We now mandate that developers include personal data annotations at the source when creating new data models, providing guidance on accurate tagging. This requirement is enforced through checks that run in our CI/CD pipelines, which:</p><ul class=""><li id="bf4e" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj"><strong class="mz gw">Automatically suggest data elements</strong>: Based on the schema’s metadata, we automatically detect the data elements for all fields defined in the schema with our detection service.</li><li id="8e05" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Validate data elements</strong>: Annotations are validated against our own taxonomy, and schemas are checked so that every field is annotated, even fields that are not considered personal data.</li><li id="8711" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj"><strong class="mz gw">Warn about downstream impact</strong>: We notify data owners when annotations can impact downstream services such as offline data pipelines and direct them to the proper resources for handling.</li></ul><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of fk og bg oh"><div class="nv nw nx"><picture><source 
srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*okQZ72FfzlqYPzhW 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*okQZ72FfzlqYPzhW 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*okQZ72FfzlqYPzhW 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*okQZ72FfzlqYPzhW 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*okQZ72FfzlqYPzhW 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*okQZ72FfzlqYPzhW 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*okQZ72FfzlqYPzhW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*okQZ72FfzlqYPzhW 640w, https://miro.medium.com/v2/resize:fit:720/0*okQZ72FfzlqYPzhW 720w, https://miro.medium.com/v2/resize:fit:750/0*okQZ72FfzlqYPzhW 750w, https://miro.medium.com/v2/resize:fit:786/0*okQZ72FfzlqYPzhW 786w, https://miro.medium.com/v2/resize:fit:828/0*okQZ72FfzlqYPzhW 828w, https://miro.medium.com/v2/resize:fit:1100/0*okQZ72FfzlqYPzhW 1100w, https://miro.medium.com/v2/resize:fit:1400/0*okQZ72FfzlqYPzhW 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 
100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pv fe pw nv nw px py be b bf z dt">Schema Annotation Enforcement</figcaption></figure><p id="d1b4" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">While shifting left has significantly helped to push classifying data earlier and increase coverage of schema annotations, <strong class="mz gw">this does not discount the importance of the rest of the classification process</strong>. Post-processing classification remains necessary, for instance, for data stores that do not have well-defined schemas. Therefore, continued investments are still needed in detection and reconciliation to cover areas that cannot be shifted left and to verify annotations that may have already been made by owners as a second layer of protection.</p><h1 id="c4e9" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Conclusion/Lessons Learned</h1><p id="2e83" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">The Airbnb data classification framework has been successful in advancing data management, security, and privacy. The journey has offered invaluable insights that have shaped our methodologies. 
Key takeaways include:</p><ul class=""><li id="2d35" class="mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu pn po pp bj">Adopting a unified strategy for classifying online and offline personal data to streamline processes</li><li id="5bdc" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Implementing a ‘Shift Left’ approach to engage with data owners early in the development cycle</li><li id="482a" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Addressing classification uncertainties through clear guidelines and decision-making</li><li id="5f0d" class="mx my gv mz b na pq nc nd ne pr ng nh ni ps nk nl nm pt no np nq pu ns nt nu pn po pp bj">Enhancing education and training initiatives for data owners and consumers</li></ul><p id="bd86" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">As the data landscape continues to rapidly change, these lessons will guide future data classification efforts and ensure continued trust and protection of customer data.</p><h1 id="8964" class="ok ol gv be om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph bj">Acknowledgements</h1><p id="08ea" class="pw-post-body-paragraph mx my gv mz b na pi nc nd ne pj ng nh ni pk nk nl nm pl no np nq pm ns nt nu go bj">Our data classification strategy has evolved over many years and we’ve been able to quickly adapt and iterate thanks to our decision to build an in-house solution. 
Security, privacy, and compliance are of utmost importance at Airbnb, and this work would not be possible without the contributions of many of our cross-functional partners and leaders.</p><p id="a941" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">This includes, but is not limited to: Bill Meng, Aravind Selvan, Juan Tamayo, Xiao Chen, Pinyao Guo, Wendy Jin, Liam McInerney, Pat Moynahan, Gabriel Gejman, Marc Blanchou, Brendon Lynch, and many others.</p><p id="9b1d" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj">If this type of work interests you, check out some of our related positions at <a class="af oj" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">Careers at Airbnb</a> or find more resources in the <a class="af oj" href="https://medium.com/airbnb-engineering" rel="noopener">Airbnb Tech Blog</a>!</p><p id="cb75" class="pw-post-body-paragraph mx my gv mz b na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu go bj"><em class="pz">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/personal-data-classification-2d816d8ea516</link>
      <guid>https://medium.com/airbnb-engineering/personal-data-classification-2d816d8ea516</guid>
      <pubDate>Mon, 19 Aug 2024 18:43:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Apache Flink® on Kubernetes]]></title>
      <description><![CDATA[<div><div></div><p id="e165" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj"><strong class="mx gu">Airbnb’s Use of A New Flink platform evolved from Apache Hadoop® Yarn</strong></p><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu nv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 
828w, https://miro.medium.com/v2/resize:fit:1100/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*doLL9u-uICR6OPtdXYE4sQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="aa15" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Introduction</h1><p id="23ea" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">At Airbnb, <a class="af pk" href="https://nightlies.apache.org/flink/flink-docs-stable/" rel="noopener ugc nofollow" target="_blank">Apache Flink</a> was introduced in 2018 as a supplementary solution for stream processing. It ran alongside Apache Spark™ Streaming for several years before transitioning to become the primary stream processing platform. In this blog post, we will delve into the evolution of Flink architecture at Airbnb and compare our prior <a class="af pk" href="https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/YARN.html" rel="noopener ugc nofollow" target="_blank">Hadoop Yarn</a> platform with the current <a class="af pk" href="https://kubernetes.io/" rel="noopener ugc nofollow" target="_blank">Kubernetes</a>-based architecture. Additionally, we will discuss the efforts undertaken throughout the migration process and explore the challenges that arose during this journey. 
In the end we will summarize the impact, learnings along the way and future plans.</p><h1 id="6d36" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Architecture Evolution</h1><p id="99b5" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">The evolution of Airbnb’s streaming processing architecture based on Apache Flink can be categorized into three distinct phases:</p><h2 id="c5b8" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Phase One: Flink jobs operated on Hadoop Yarn with Apache Airflow serving as the job scheduler.</h2><p id="628a" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">Around 2018, several teams at Airbnb adopted Flink as their streaming processing engine, mainly due to its superior low-latency capabilities compared to Spark Streaming. During this period, Flink jobs were running on Hadoop Yarn, and <a class="af pk" href="https://airflow.apache.org/" rel="noopener ugc nofollow" target="_blank">Airflow</a> was employed as the workflow manager for task scheduling and dependency management.</p><p id="5f92" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">The selection of Airflow as the workflow manager was largely influenced by its widespread use in addressing various job scheduling needs, as there were no other user-friendly open-source alternatives readily available at that time. Each team was responsible for handling their Airflow Directed Acyclic Graphs (DAGs), job source code, and the requisite dependency JARs. 
Typically, Flink JAR files were locally built before deployment to Amazon S3.</p><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*QDslbwDXfytHjwcnTm9DfQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*QDslbwDXfytHjwcnTm9DfQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*QDslbwDXfytHjwcnTm9DfQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*QDslbwDXfytHjwcnTm9DfQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*QDslbwDXfytHjwcnTm9DfQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*QDslbwDXfytHjwcnTm9DfQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*QDslbwDXfytHjwcnTm9DfQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*QDslbwDXfytHjwcnTm9DfQ.png 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="3e37" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">This architecture met our requirements during that period, when the range of use cases was limited.</p><p id="1797" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">From 2019 onwards, Apache Flink gained significant traction at Airbnb, replacing Spark Streaming as the primary stream processing platform. As Flink usage scaled, we encountered various challenges and limitations in this architecture. To begin with, Airflow’s batch-oriented design, relying on polling intervals, did not match Airbnb’s needs, and we experienced significant delays in job start and failure recovery, often causing SLA violations for low-latency use cases. Airflow also caused a singleton issue: duplicate job submissions occasionally occurred due to race conditions among Airflow workers and user operations that did not follow expected patterns. In addition, Airflow’s Directed Acyclic Graph (DAG) structure was complex and did not work well with some of Airbnb’s streaming use cases. 
We also encountered engineering context mismatch in this architecture: product engineers might find themselves unfamiliar with Apache Airflow and Hadoop, resulting in a steep learning curve when setting up new Apache Flink jobs.</p><p id="ee97" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">To tackle the above technical and operational challenges, we started to explore new possibilities. Our initial step involved replacing Airflow with a customized lightweight streaming job scheduler, marking the inception of Phase Two.</p><h2 id="8147" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Phase Two: Flink jobs operated on Hadoop Yarn, with a lightweight streaming job scheduler.</h2><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qb"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*J6I2yYCug7akgfERaNEnOg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 
100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*J6I2yYCug7akgfERaNEnOg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*J6I2yYCug7akgfERaNEnOg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*J6I2yYCug7akgfERaNEnOg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*J6I2yYCug7akgfERaNEnOg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*J6I2yYCug7akgfERaNEnOg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*J6I2yYCug7akgfERaNEnOg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*J6I2yYCug7akgfERaNEnOg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="03bb" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">At a high level, Airflow was replaced by a lightweight streaming job scheduler operating on Kubernetes. The job scheduler contains a master node and a pool of worker nodes:</p><p id="aaf5" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">The master node is responsible for managing the metadata of all Flink jobs and ensuring the proper life cycle of each worker node. 
This includes tasks such as parsing user-provided job configurations, synchronizing metadata and job statuses with Apache Zookeeper™, and ensuring that worker nodes consistently maintain their expected states.</p><p id="9ed6" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">A worker node is responsible for handling the dependencies and life cycle of a single Flink job. Workers package the necessary dependencies, submit the Flink job to Hadoop Yarn, continuously monitor its status, and trigger an immediate restart in the event of a failure.</p><p id="9929" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">The Phase Two design resulted in faster turnaround time and reduced downtime during job restarts. It also resolved single-point-of-failure issues with Zookeeper.</p><p id="e04e" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">As usage of Flink grew, we encountered new challenges in Phase Two:</p><ul class=""><li id="a815" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj">Lack of CI/CD: Flink developers had to devise their own version control strategies.</li><li id="df82" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Absence of native secrets management: Hadoop Yarn provides no built-in secrets management.</li><li id="21d6" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Limited resource and dependency isolation: Each supported Flink version had to be manually preinstalled on the Yarn cluster. 
While Yarn’s resource queues could provide some level of resource isolation, job-level isolation was absent.</li><li id="577e" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Service Discovery complexity: As more use cases were onboarded, each potentially requiring access to various internal Airbnb services, configuring service access on Yarn proved to be cumbersome. It forced a binary choice between enabling service access for the entire cluster or none at all.</li><li id="3b0d" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Monitoring and debugging challenges: Managing and maintaining the logging pipeline and SSH access became non-trivial tasks on a multi-tenant Yarn cluster.</li><li id="f235" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Ongoing complexity and dependencies: Although the Flink job scheduler was lightweight compared to Airflow, it introduced additional complexities.</li></ul><h2 id="2721" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Phase Three (current state): Flink jobs run on Kubernetes, and the job scheduler is eliminated.</h2><p id="2c34" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj"><a class="af pk" href="https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/" rel="noopener ugc nofollow" target="_blank">Deploying Flink on Kubernetes</a> allows direct Flink deployment on a running Kubernetes cluster. 
With this integration we can explore enabling efficient autoscaling and the Kubernetes operator to simplify the management of Flink jobs and clusters.</p><p id="fecc" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">Flink on Kubernetes offers several advantages over Hadoop Yarn addressing the above challenges:</p><ul class=""><li id="6cb1" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj">Developer experience: Standardized by integrating with the existing CI/CD systems.</li><li id="8a41" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Secrets Management: With Flink on Kubernetes, each Flink job can securely store its own secrets within the pods. This provides a more secure way to manage sensitive information.</li><li id="7081" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Isolated Environment: Jobs running on Flink on Kubernetes benefit from isolation at both the resource and dependency levels. Each job can run on its dedicated Flink version if supported by its image, allowing for better management of dependencies.</li><li id="e2dd" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Enhanced Monitoring: Integration with Airbnb’s pre-defined logging and metric sidecars on Kubernetes simplifies setup and improves monitoring. This enables detailed insights into individual pods and rate limiting for logging per pod, making it easier to track and troubleshoot issues.</li><li id="58c7" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Service Discovery: Flink jobs now adhere to Airbnb’s standardized approach for service discovery, using the cluster mesh. 
This ensures consistent and reliable communication between services.</li><li id="4641" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qc qd qe bj">Simplified SSH access: Users with the appropriate permissions can now SSH into the Flink pod without the need for an SSH tunnel. This provides greater flexibility and control over SSH permissions per job.</li></ul><p id="62bd" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">Additionally, we’ve observed an increasing level of Kubernetes support and adoption within the Flink community, which increased our confidence in running Flink on Kubernetes.</p><p id="167c" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">It’s worth mentioning that Kubernetes brings its own risks and limitations. For instance, a single Flink task manager failover can lead to the pause of the entire job process. This can pose issues in scenarios with frequent node rotations within Kubernetes and large jobs deployed with hundreds of task managers. For context, node rotation on Kubernetes is performed to ensure the operability and stability of the cluster. It involves replacing existing nodes with new ones, typically with updated configurations or to perform maintenance tasks, with the goals of applying host configuration changes, maintaining node balance and enhancing operational efficiency. In comparison, node rotations on Yarn occur less frequently, so the impact on job availability is less significant. 
We will explore how we are mitigating these challenges in the Future Work section.</p><h1 id="1865" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Components Deep Dive</h1><p id="1c88" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">Below is an overview of our current architecture:</p><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*V2H_S-f6NIaDvtsR6ozF8g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*V2H_S-f6NIaDvtsR6ozF8g.png 640w, https://miro.medium.com/v2/resize:fit:720/1*V2H_S-f6NIaDvtsR6ozF8g.png 720w, https://miro.medium.com/v2/resize:fit:750/1*V2H_S-f6NIaDvtsR6ozF8g.png 750w, 
https://miro.medium.com/v2/resize:fit:786/1*V2H_S-f6NIaDvtsR6ozF8g.png 786w, https://miro.medium.com/v2/resize:fit:828/1*V2H_S-f6NIaDvtsR6ozF8g.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*V2H_S-f6NIaDvtsR6ozF8g.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*V2H_S-f6NIaDvtsR6ozF8g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="bfac" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">To provide a better understanding of the system, below is a deep dive of the five primary components, as well as how users interact with them when setting up a new Flink job:</p><ul class=""><li id="1582" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj"><strong class="mx gu">Job configurations: </strong>This serves as an abstraction layer over Kubernetes and CI/CD components, providing Flink users with a simplified interface for creating Flink application templates. It shields users from the complexities of the underlying Kubernetes infrastructure. Flink users define the core specifications of their Flink job via a configuration file. 
This includes critical information like the entrypoint class name, job parallelism, and the necessary ingress services and sinks.</li></ul><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu ql"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*GznnjNbU0UNAyuHh_z3jDQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*GznnjNbU0UNAyuHh_z3jDQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*GznnjNbU0UNAyuHh_z3jDQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*GznnjNbU0UNAyuHh_z3jDQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*GznnjNbU0UNAyuHh_z3jDQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*GznnjNbU0UNAyuHh_z3jDQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*GznnjNbU0UNAyuHh_z3jDQ.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*GznnjNbU0UNAyuHh_z3jDQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><ul class=""><li id="ad86" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj"><strong class="mx gu">Image management: </strong>This component involves the pre-construction of Flink base images, which are bundled with essential dependencies required to access Airbnb resources. These images are stored in Amazon Elastic Container Registry and can be readily deployed with user Jars or further customized to meet specific user needs.</li></ul><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qm"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 
50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*Qc3AC_55cp7Tl-a1c0z9nQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><ul class=""><li id="5848" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj"><strong class="mx gu">CI/CD</strong>: By introducing a few customizations to support Flink’s stateful deployment, we’ve integrated Flink with our existing CI/CD system, providing a standardized version control and continuous delivery experience. 
Flink jobs are deployed within Kubernetes, each residing in its distinct namespace to ensure isolation and effective administration.</li></ul><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qn"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ZfK6dPKgG0E2bX6l 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ZfK6dPKgG0E2bX6l 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ZfK6dPKgG0E2bX6l 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*ZfK6dPKgG0E2bX6l 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ZfK6dPKgG0E2bX6l 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ZfK6dPKgG0E2bX6l 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ZfK6dPKgG0E2bX6l 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*ZfK6dPKgG0E2bX6l 640w, https://miro.medium.com/v2/resize:fit:720/0*ZfK6dPKgG0E2bX6l 720w, https://miro.medium.com/v2/resize:fit:750/0*ZfK6dPKgG0E2bX6l 750w, https://miro.medium.com/v2/resize:fit:786/0*ZfK6dPKgG0E2bX6l 786w, https://miro.medium.com/v2/resize:fit:828/0*ZfK6dPKgG0E2bX6l 828w, https://miro.medium.com/v2/resize:fit:1100/0*ZfK6dPKgG0E2bX6l 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ZfK6dPKgG0E2bX6l 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 
50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><ul class=""><li id="4643" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj"><strong class="mx gu">Flink portal: </strong>an API service that offers essential features for managing the states of Flink jobs. These features include stopping a Flink job with a savepoint and querying completed checkpoints on Amazon S3. Additionally, it provides a self-service UI portal, enabling users to monitor and check the status of their jobs. Users also gain access to critical job state management functionalities, empowering them to either initiate the job from a bootstrapped savepoint or resume it from a previous checkpoint.</li></ul><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qo"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*T_COfWpufKSxiKI2IPapfg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*T_COfWpufKSxiKI2IPapfg.png 640w, https://miro.medium.com/v2/resize:fit:720/1*T_COfWpufKSxiKI2IPapfg.png 720w, https://miro.medium.com/v2/resize:fit:750/1*T_COfWpufKSxiKI2IPapfg.png 750w, https://miro.medium.com/v2/resize:fit:786/1*T_COfWpufKSxiKI2IPapfg.png 786w, https://miro.medium.com/v2/resize:fit:828/1*T_COfWpufKSxiKI2IPapfg.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*T_COfWpufKSxiKI2IPapfg.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*T_COfWpufKSxiKI2IPapfg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><ul class=""><li id="8652" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qc qd qe bj"><strong class="mx gu">Flink job runtime:</strong> Each Flink job is deployed as an independent application cluster on Kubernetes. ZooKeeper, etcd, and Amazon S3 are used for fault tolerance and state storage. Additionally, pre-configured sidecar containers accompany the Flink containers to support critical functions such as logging, metrics, DNS, and more. 
A service mesh is employed to facilitate communication between Flink jobs and other microservices.</li></ul><figure class="nw nx ny nz oa ob nt nu paragraph-image"><div role="button" tabindex="0" class="oc od fi oe bg of"><div class="nt nu qk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*e5egS14K6F1Cr9VRyXdXXw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*e5egS14K6F1Cr9VRyXdXXw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*e5egS14K6F1Cr9VRyXdXXw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*e5egS14K6F1Cr9VRyXdXXw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*e5egS14K6F1Cr9VRyXdXXw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*e5egS14K6F1Cr9VRyXdXXw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*e5egS14K6F1Cr9VRyXdXXw.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*e5egS14K6F1Cr9VRyXdXXw.png 1400w" 
sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="2a07" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Impact</h1><h2 id="8c95" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Improved Developer Velocity</h2><p id="86e7" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">Onboarding Flink jobs is faster: our developers noted that it now takes hours instead of days, which lets them focus more on their application logic.</p><h2 id="3bad" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Improvement in Flink Job Availability and Latency</h2><p id="09a3" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">The architecture of Flink on Kubernetes improves job availability and scheduling latency by eliminating certain components of the Flink client and job scheduler found in Flink on Yarn.</p><h2 id="5326" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Cost Savings in Infrastructure</h2><p id="cd8b" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">Streamlining the Flink infrastructure and removing certain components, such as the job scheduler, have reduced our infrastructure costs. 
Additionally, by running Flink jobs on a shared Kubernetes cluster at Airbnb, we could potentially improve the overall cost efficiency of our company’s infrastructure.</p><h1 id="1d5b" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Future Work</h1><h2 id="c7de" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Improvement in Job Availability</h2><p id="c837" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">Node rotations in Kubernetes can cause Flink job restarts and result in downtime. While Flink itself can recover from job restarts without data loss, the potential downtime and availability impact may be unacceptable for highly latency-sensitive applications. To address this, we are evaluating a few approaches:</p><ol class=""><li id="7388" class="mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns qp qd qe bj">Reducing the number of node rotations to minimize job restarts.</li><li id="b4ac" class="mv mw gt mx b my qf na nb nc qg ne nf ng qh ni nj nk qi nm nn no qj nq nr ns qp qd qe bj">Faster job recovery.</li></ol><h2 id="ce92" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Enable Job Autoscaling</h2><p id="7f73" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">With the introduction of <a class="af pk" href="https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/" rel="noopener ugc nofollow" target="_blank">Reactive Mode</a> in Flink 1.13, users can adjust the parallelism of their jobs dynamically by adding or removing task managers, without manually stopping and redeploying the job; on each rescaling event, Flink restarts the job from its latest completed checkpoint. This autoscaling capability can enhance job stability and cost efficiency. 
In the future, we could enable autoscaling for Flink workloads on Kubernetes by using system metrics (such as CPU usage) and Flink metrics (such as backpressure) to determine the appropriate parallelism.</p><h2 id="44f0" class="pl oi gt be oj pm pn dx on po pp dz or ng pq pr ps nk pt pu pv no pw px py pz bj">Flink Kubernetes Operator</h2><p id="750b" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">The <a class="af pk" href="https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/" rel="noopener ugc nofollow" target="_blank">Flink Kubernetes Operator</a> utilizes Custom Resources and functions as a controller to manage the entire production lifecycle of Flink applications. By leveraging the operator, we can streamline the operation and deployment processes for Flink jobs. It provides better control over the deployment and lifecycle of jobs, and an out-of-the-box solution for autoscaling and autotuning.</p><h1 id="8339" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Conclusion</h1><p id="b244" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">To summarize, the migration of Airbnb’s Apache Flink-based stream processing architecture from Hadoop Yarn to Kubernetes has been a significant milestone in enhancing our streaming data processing capabilities. This transition has resulted in a more streamlined and user-friendly experience for Flink developers. By overcoming challenges that were complex to address on Yarn, we have laid the foundation for more efficient and effective streaming data processing.</p><p id="bd8a" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">As we look ahead, we are committed to further refining our approach and resolving any remaining challenges. 
We are enthusiastic about the ongoing growth and potential of Apache Flink within our company, and we anticipate continued innovation and improvement in the future.</p><p id="0c70" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">If this kind of work sounds appealing to you, check out our <a class="af pk" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open roles</a> — we’re hiring!</p><h1 id="fcdd" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">Appreciations</h1><p id="34fc" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj">The Flink on Kubernetes platform would not have been possible without cross-functional and cross-org collaborators as well as leadership support. They include, but are not limited to: Jingwei Lu, Long Zhang, Daniel Low, Weibo He, Zack Loebel-Begelman, Justin Cunningham, Adam Kocoloski, Liyin Tang and Nathan Towery.</p><p id="6eda" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">Special thanks to the broader Airbnb data community members who provided input or aid to the implementation team throughout the design, development, and launch phases.</p><p id="4277" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj">We also want to thank Wei Hou and Xu Zhang for their support in authoring this post during their time at Airbnb.</p><h1 id="b11e" class="oh oi gt be oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe bj">****************</h1><p id="17cc" class="pw-post-body-paragraph mv mw gt mx b my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no pj nq nr ns gm bj"><em class="qq">Apache Spark™, Apache Airflow™, and Apache ZooKeeper™ are trademarks of The Apache Software Foundation.</em></p><p id="e9e6" 
class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj"><em class="qq">Apache Flink® and Apache Hadoop® are registered trademarks of The Apache Software Foundation.</em></p><p id="9f1e" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj"><em class="qq">Kubernetes® is a registered trademark of The Linux Foundation.</em></p><p id="ff92" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj"><em class="qq">Amazon S3 and AWS are trademarks of Amazon.com, Inc. or its affiliates.</em></p><p id="62d8" class="pw-post-body-paragraph mv mw gt mx b my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns gm bj"><em class="qq">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/apache-flink-on-kubernetes-84425d66ee11</link>
      <guid>https://medium.com/airbnb-engineering/apache-flink-on-kubernetes-84425d66ee11</guid>
      <pubDate>Wed, 31 Jul 2024 19:04:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[How Airbnb Smoothly Upgrades React]]></title>
      <description><![CDATA[<div class="gm gn go gp gq"><div class="ab ca"><div class="ch bg fy fz ga gb"><div><div><h2 id="0ce3" class="pw-subtitle-paragraph hq gs gt be b hr hs ht hu hv hw hx hy hz ia ib ic id ie if cp dt">Incrementally modernizing our frontend infrastructure to roll out the latest React features without downgrades</h2><div></div><figure class="nk nl nm nn no np nh ni paragraph-image"><div role="button" tabindex="0" class="nq nr fi ns bg nt"><div class="nh ni nj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 
786w, https://miro.medium.com/v2/resize:fit:828/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*8jiO3WebwJ4aoyYFYltkOw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="73a7" class="nv nw gt be nx ny nz ht oa ob oc hw od oe of og oh oi oj ok ol om on oo op oq bj">Introduction</h1><p id="e755" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Airbnb’s frontend recently reached a major milestone: all of our web surfaces have been upgraded from React 16 to React 18, the current major version of React¹. This was a big project for a product with many surfaces, including Guest and Host pages as well as many internal tools. To safely perform this upgrade, we created the React Upgrade System: reusable infrastructure that allows us to roll out new versions of React progressively across our monorepo and measure the results of the upgrade. 
In this blog post, we’ll discuss our upgrade philosophy, the system we created, and what we learned from performing this upgrade.</p><p id="c530" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">While this post primarily focuses on React, the system and lessons are applicable to many web frameworks and libraries that require regular upgrades.</p><h2 id="3c6d" class="ps nw gt be nx pt pu dx oa pv pw dz od pa px py pz pe qa qb qc pi qd qe qf qg bj">Challenges of upgrading</h2><p id="b989" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Upgrading dependencies is a common task in any long-lived project. Upgrades fix bugs, improve performance, and unlock new APIs. Some upgrades are simple, but upgrades become more difficult when large amounts of product code rely on changed APIs or subtle assumptions about behavior. In Airbnb’s web monorepo, we only allow one version of each top-level dependency (with some rare exceptions), with one package.json at the root of the repo. This ensures that code within the monorepo is internally compatible and consistent, and that we avoid shipping duplicate packages to users. Before the upgrade system, having a <em class="qh">single version </em>of each dependency meant performing an atomic update, which requires a huge amount of up-front migration work, a long-running upgrade branch, and a single milestone when it is finally deployed to users. Such an approach is error-prone and risky, thus requiring a “heroic” engineering effort to ship clean upgrades.</p><p id="9ece" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">Ideally, we’d be shipping small, incremental upgrades that have no issues. 
Without some way to test an upgrade and roll it out progressively across a large monorepo, we often needed to attempt an upgrade multiple times, downgrading whenever a problem was found. Performance regressions were particularly difficult to catch with this strategy: we went straight from 0% to 100% rollout on deployment, so there was no way to collect performance data before release.</p><figure class="nk nl nm nn no np nh ni paragraph-image"><div role="button" tabindex="0" class="nq nr fi ns bg nt"><div class="nh ni qi"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Kceo280VknZ1XaA4X6A7kQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*Kceo280VknZ1XaA4X6A7kQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*Kceo280VknZ1XaA4X6A7kQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*Kceo280VknZ1XaA4X6A7kQ.png 750w, 
https://miro.medium.com/v2/resize:fit:786/1*Kceo280VknZ1XaA4X6A7kQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*Kceo280VknZ1XaA4X6A7kQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*Kceo280VknZ1XaA4X6A7kQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*Kceo280VknZ1XaA4X6A7kQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qj fe qk nh ni ql qm be b bf z dt">Ideal vs Reality graphs of our major and minor versions of React over time.</figcaption></figure><p id="91c5" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">Our goal with the React Upgrade System was to make upgrading more seamless, less heroic, and more routine. 
Specifically, our goals were to be able to:</p><ol class=""><li id="1195" class="or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm qn qo qp bj"><strong class="ot gu">Upgrade incrementally</strong> so that we get feedback and learn lessons as soon as we can.</li><li id="5e0b" class="or os gt ot b hr qq ov ow hu qr oy oz pa qs pc pd pe qt pg ph pi qu pk pl pm qn qo qp bj"><strong class="ot gu">Upgrade often</strong> so that the delta between our version and the upgraded version is as small as possible.</li><li id="6707" class="or os gt ot b hr qq ov ow hu qr oy oz pa qs pc pd pe qt pg ph pi qu pk pl pm qn qo qp bj"><strong class="ot gu">Test upgrades </strong>so that we can precisely measure the performance impact of upgrades and make informed decisions about upgrade paths using this data.</li></ol><h1 id="f9df" class="nv nw gt be nx ny nz ht oa ob oc hw od oe of og oh oi oj ok ol om on oo op oq bj">Designing the React Upgrade System</h1><p id="8331" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Working backwards from these goals, we started to get an idea of what our ideal architecture would look like. 
We wanted to avoid a long-running upgrade branch so that we could upgrade incrementally, and we wanted to be able to A/B test the upgrade so that we would get feedback from production to inform shipping decisions.</p><figure class="nk nl nm nn no np nh ni paragraph-image"><div role="button" tabindex="0" class="nq nr fi ns bg nt"><div class="nh ni qv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 828w, 
https://miro.medium.com/v2/resize:fit:1100/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*Vz9hMOfWVbwEjHvS2Ym9cQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qj fe qk nh ni ql qm be b bf z dt"><em class="qw">Simplified diagram of our ideal upgrade system</em></figcaption></figure><p id="17d3" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">There were a couple of problems to solve with the most naive implementation of this system: we needed to pick a single version of React for rendering, and it was challenging to dynamically switch between the two versions at runtime. Here’s what the code would look like to render a basic application using this naive approach:</p><pre class="nk nl nm nn no qx qy qz bo ra ba bj">import React18 from 'react'; <br />import React16 from 'react'; // duplicated import?<br /><br />if (shouldEnableReact18()) {<br />  const root = React18.createRoot(container);<br />  root.render(&lt;App /&gt;);<br />} else {<br />  React16.render(&lt;App /&gt;, container);<br />}</pre><p id="2f5a" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">There are two issues with this:</p><ol class=""><li id="3376" class="or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm qn qo qp bj">We don’t want to bundle both versions of React in the application, or we’ll double our framework bundle size. 
Further, we might need to change the JSX transformation being used at build time, making our <code class="cw rg rh ri qy b">&lt;App /&gt;</code> incompatible with one version or the other.</li><li id="8048" class="or os gt ot b hr qq ov ow hu qr oy oz pa qs pc pd pe qt pg ph pi qu pk pl pm qn qo qp bj">It’s not clear where the imports should come from. The ‘react’ dependency will point to either React 16 or React 18, but not to both.</li></ol><p id="8727" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">To solve these problems, we used <strong class="ot gu">module aliasing</strong> to split the versions, and <strong class="ot gu">environment targeting </strong>to build and run the two split versions of React.</p><h2 id="d3e9" class="ps nw gt be nx pt pu dx oa pv pw dz od pa px py pz pe qa qb qc pi qd qe qf qg bj">Module aliasing</h2><p id="8d2c" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">We addressed the problem of where these imports are coming from using module aliasing. Using yarn, we added another react dependency to our package.json, e.g.,</p><pre class="nk nl nm nn no qx qy qz bo ra ba bj">"react-18": "npm:react@18"</pre><p id="014c" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">which allowed us to import React from the ‘react-18’ package. This got us part of the way there. Many tools (such as custom resolvers and build systems) need to know which of the two versions to use. To centralize the logic, we wired up all of our custom tooling into a central, “global alias” configuration. This global alias configuration allowed us to alias in one place for all of our different tools. Babel, Jest™, Webpack™, and other custom resolution logic all need to be aware of the conditions under which we want to redirect imports from ‘react’ to ‘react-18’. 
Aliasing the modules with our “global alias” configuration meant that user code did not need to change at all, and we were able to handle this redirect behind the scenes.</p><h2 id="78f1" class="ps nw gt be nx pt pu dx oa pv pw dz od pa px py pz pe qa qb qc pi qd qe qf qg bj">TypeScript discrepancies</h2><p id="5f0b" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Given that any component could be run in React 16 or 18, we wanted to use the types for each component that work across both versions during our upgrade period. Thankfully, the React team has maintained backwards compatibility, even between major versions.</p><p id="8f26" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">We installed the types for React 18, and for newly added APIs in React 18, we created a shim layer for these APIs that worked in both React 16 and 18 (for example, <a class="af rj" href="https://react.dev/blog/2022/03/29/react-v18#usetransition" rel="noopener ugc nofollow" target="_blank">useTransition</a> acted as a no-op in 16). For APIs with no possible shim (for example, <a class="af rj" href="https://react.dev/reference/react/useId" rel="noopener ugc nofollow" target="_blank">useId</a>), we indicated through type augmentation that this hook may be undefined at runtime.</p><p id="0d12" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">For TypeScript-only <a class="af rj" href="https://github.com/DefinitelyTyped/DefinitelyTyped/issues/46691" rel="noopener ugc nofollow" target="_blank">breaking changes in React 18</a>, we waited until the React 18 upgrade was complete before incrementally fixing these. 
We augmented the types to patch the differences, which allowed us to fix these new TypeScript errors progressively across our monorepo.</p><h2 id="760e" class="ps nw gt be nx pt pu dx oa pv pw dz od pa px py pz pe qa qb qc pi qd qe qf qg bj">Environment targeting</h2><p id="2fda" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">To solve the problem of duplicated imports, we needed to produce two different build artifacts: one containing React 16 and one containing React 18. Let’s call these the “control” and “treatment” artifacts, respectively. Since Airbnb uses Server-Side Rendering (SSR), we also needed to run these two different artifacts in different node processes on the server. Using Kubernetes®, we set up two different Kubernetes environments that ran these control and treatment artifacts. Let’s call this setup <strong class="ot gu">environment targeting</strong>.</p><figure class="nk nl nm nn no np nh ni paragraph-image"><div role="button" tabindex="0" class="nq nr fi ns bg nt"><div class="nh ni qi"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and 
(max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*QXkDtlSK2x1Hy8X9Tg-KQQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qj fe qk nh ni ql qm be b bf z dt"><strong class="be nx"><em class="qw">Module aliasing </em></strong><em class="qw">and </em><strong class="be nx"><em class="qw">environment targeting</em></strong><em class="qw"> in use together to deploy different versions of the framework together in production</em></figcaption></figure><p id="644d" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">We also wrote an environment variable (REACT_UPGRADE) to our assets at build time and set this variable at runtime in our node SSR service. 
This lets us perform conditional logic that might be necessary on only one side or the other of our upgrade system.</p><p id="cf07" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">This setup also worked for us in local development. Our “local” development environments were also deployed, so we were able to configure the React version for local development in the same way as production using this setup. As each SSR service was upgraded to React 18, we also switched the development environment for that service to React 18 to keep production and local development versions synchronized.</p><h2 id="98d6" class="ps nw gt be nx pt pu dx oa pv pw dz od pa px py pz pe qa qb qc pi qd qe qf qg bj">Testing the upgrade</h2><p id="6e63" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Airbnb has a comprehensive test suite, which was helpful for building confidence in the safety of this upgrade before exposing the upgrade to users. Our test suite includes visual regression testing, integration testing, and unit testing. Before launching to users, we fixed all new failures in each of these suites.</p><p id="8119" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">Unit tests were the hardest to abstract from framework internals. Because we use a <a class="af rj" rel="noopener" href="https://medium.com/airbnb-engineering/phase-ii-enzyme-d9efa717e297">combination of Enzyme and React Testing Library</a>, we needed to fix assumptions about APIs and framework internals in unit tests, shims, and adapters. To achieve this, we ran all of our unit tests under both React 16 and 18, allowing existing failures in the React 18 test suite as we progressively fixed them. 
We used this “permitted failures” list to ratchet down the number of test failures over time, which prevented backsliding, as no new failures were allowed on the list. This approach allowed us to fix problems incrementally with components and our test environments.</p><p id="8646" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">We tracked the work of resolving hundreds of test failures with dashboards, merged fixes incrementally using the upgrade system, and split the work amongst a handful of developers. This made the migration work largely transparent to the broader frontend team and helped us gain confidence in the upgrade before rollout.</p><h2 id="3f6f" class="ps nw gt be nx pt pu dx oa pv pw dz od pa px py pz pe qa qb qc pi qd qe qf qg bj">Progressive rollout</h2><p id="96e4" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Once we had <strong class="ot gu">module aliasing </strong>and <strong class="ot gu">environment targeting</strong>, we had the capability to author and deliver code for two different versions of React, all from the same codebase. To ensure safety and testability, we also needed a way of rolling out this new environment progressively. To reduce the amount of change happening at once, we wanted to control the rollout across traffic and product surfaces. Our experimentation infrastructure allowed us to direct traffic to each of our two production environments (control and treatment) at will. This setup also allowed us to test the upgrade internally first, and to completely turn off the upgrade if issues were found.</p><p id="9cf9" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">Controlling the rollout to different surfaces is more difficult. Within a Single Page App, managing multiple React versions would mean unmounting and mounting React roots. 
This would lead to poor performance and degrade the user experience.</p><p id="c0e1" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">For this reason, we managed the surface rollout at the app level. Airbnb’s monorepo houses many Single Page Apps, so it was useful to have the React Upgrade System in place to be able to turn the upgrade on and off for each of these apps. Using our React Upgrade System, we were able to roll this out to a single app internally first, giving developers a way to opt in to and out of the upgrade for testing, both in development and on our staging sites. This approach let us avoid having a long-running feature branch, helping us achieve our goal of incremental upgrades.</p><h1 id="3fab" class="nv nw gt be nx ny nz ht oa ob oc hw od oe of og oh oi oj ok ol om on oo op oq bj">Feature adoption and future work</h1><p id="2ccc" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Using the system, we completely rolled out React 18 to all web surfaces at Airbnb, with no rollbacks required. After the upgrade, we were able to start testing new APIs such as <a class="af rj" href="https://react.dev/reference/react-dom/client/createRoot" rel="noopener ugc nofollow" target="_blank">new root APIs</a> and <a class="af rj" href="https://react.dev/reference/react/startTransition" rel="noopener ugc nofollow" target="_blank">concurrent rendering features</a>. We intentionally held off for a few weeks on adopting these features until the upgrade had settled. 
This way we could be confident that we wouldn’t need to downgrade and have to revert code changes.</p><p id="094b" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">It’s been exciting to see performance improvements from adopting these new features, and we are continuing to experiment with expanding them to key UI surfaces that would benefit.</p><p id="0493" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">To ensure that we meet our goal of upgrading often, we will use the React Upgrade System to test the <a class="af rj" href="https://react.dev/community/versioning-policy#canary-channel" rel="noopener ugc nofollow" target="_blank">canary channel of React</a>. Instead of pointing to React 18, we can just point at the canary tag and get a preview of what migration work needs to happen now for <a class="af rj" href="https://react.dev/blog/2024/02/15/react-labs-what-we-have-been-working-on-february-2024#the-next-major-version-of-react" rel="noopener ugc nofollow" target="_blank">React 19</a>. So that upgrading does not require a heroic effort, staying current should be a continual effort spread out over time, rather than a large, one-off change.</p><h1 id="4ee5" class="nv nw gt be nx ny nz ht oa ob oc hw od oe of og oh oi oj ok ol om on oo op oq bj">Conclusion</h1><p id="1121" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Our goals for the React Upgrade System were to enable us to <strong class="ot gu">upgrade incrementally, test upgrades</strong>, and <strong class="ot gu">upgrade often. </strong>Combining environment targeting and our aliasing system has allowed us to upgrade incrementally and test the upgrades. 
We’re beginning to run our frontend against React 19 beta, getting a head start on React 19.</p><p id="df49" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">We’d like to acknowledge the React team for putting effort into backwards compatibility between React versions, even major versions. Without that effort, this upgrade approach would not be possible.</p><p id="e1e6" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">Using the React Upgrade System, we gained confidence in our rollout of React 18, and will use this approach for future upgrades. We believe investing in an upgrading system is worthwhile, as upgrades will continue to be needed over time. The React Upgrade System has allowed us to test and roll out upgrades incrementally, ensuring that we’re delivering the best user experience and performance possible for our users.</p><p id="293a" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">If this kind of work sounds appealing to you, check out our <a class="af rj" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open roles</a>. We’re hiring!</p><h1 id="24af" class="nv nw gt be nx ny nz ht oa ob oc hw od oe of og oh oi oj ok ol om on oo op oq bj">Acknowledgments</h1><p id="4cab" class="pw-post-body-paragraph or os gt ot b hr ou ov ow hu ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm gm bj">Thanks to Josh Nelson, Kim Nguyen, Andre Wiggins, Callie Riggins Zetino, James Robinson, Dan Beam, Kaeson Ho, Rae Liu, Michael James, Noah Sugarman, Laurie Jin, Brie Bunge, Matt Mulder, and Victor Lin for their assistance in creating the React Upgrade System and the pieces comprising it.</p></div></div></div><div class="ab ca rk rl rm rn" role="separator"><div class="gm gn go gp gq"><div class="ab ca"><div class="ch bg fy fz ga gb"><p id="7c76" 
class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj">[1]: React 17 was <a class="af rj" href="https://legacy.reactjs.org/blog/2020/10/20/react-v17.html" rel="noopener ugc nofollow" target="_blank">released in 2020</a> as a no-feature “stepping stone” release with minimal breaking changes. By the time we were working on this upgrade, React 18 had been released, so we opted to upgrade directly to 18. As of writing, React 19 is in beta, and we are reusing our React Upgrade System for React 19.</p><p id="6464" class="pw-post-body-paragraph or os gt ot b hr pn ov ow hu po oy oz pa pp pc pd pe pq pg ph pi pr pk pl pm gm bj"><em class="qh">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/how-airbnb-smoothly-upgrades-react-b1d772a565fd</link>
      <guid>https://medium.com/airbnb-engineering/how-airbnb-smoothly-upgrades-react-b1d772a565fd</guid>
      <pubDate>Tue, 23 Jul 2024 19:02:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Rethinking Text Resizing on Web]]></title>
      <description><![CDATA[<div class="ab ca"><div class="ch bg fy fz ga gb"><div><div class="hu hv hw hx hy"></div><figure class="nd ne nf ng nh ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fi nl bg nm"><div class="na nb nc"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*iM9vFw9B1-jUopP3lX1Swg.jpeg 
1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="9c13" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Airbnb has made significant strides in improving web accessibility for Hosts and guests who require larger text sizes.</p><p id="5444" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">This post takes an in-depth look at:</p><ol class=""><li id="add7" class="no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on oo bj">The problems encountered on mobile web when relying solely on browser zoom.</li><li id="afef" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol om on oo bj">The challenges of introducing changes that would impact the workflow of all frontend engineers.</li><li id="4ba6" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol om on oo bj">The benefits seen since launching these accessibility improvements.</li></ol><p id="e8d2" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">by: <a class="af ou" href="https://www.linkedin.com/in/bassettsj/" rel="noopener ugc nofollow" target="_blank">Steven Bassett</a></p><p id="1697" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Improving web accessibility is a critical priority at Airbnb, 
and we use the Web Content Accessibility Guidelines (WCAG) to help guide our compliance efforts. One area that often leads to accessibility issues is <a class="af ou" href="https://www.w3.org/WAI/WCAG21/Understanding/resize-text.html" rel="noopener ugc nofollow" target="_blank">WCAG 1.4.4 Resize Text (Level AA)</a>. This guideline, which we’ll refer to as Resize Text, is particularly beneficial for people with low vision, whether correctable or not (for example, with glasses or prescription contacts). The standard specifies that web content and functionality must be maintained when text is scaled up to 200% (2x) of its original size. Ensuring our site meets this guideline is an important part of our ongoing work to enhance accessibility for all of our users.</p><p id="9c3c" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">In this blog post, we’ll explore our investigation into the importance of this guideline, how we analyzed our site issues, the technical benefits of using rem units, how we decided on an approach, the cross-browser support issues we encountered, and the benefits we saw in reducing the number of reported issues for Resize Text.</p><h1 id="cded" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Meeting the Needs of Users with Vision Difficulties</h1><blockquote class="pt"><p id="3819" class="pu pv gt be pw px py pz qa qb qc ol dt"><strong class="al">“90 million Americans over 40 have vision and eye problems. 
That’s more than 3 in 5.”</strong></p></blockquote><p id="a309" class="pw-post-body-paragraph no np gt nq b nr qd nt nu nv qe nx ny nz qf ob oc od qg of og oh qh oj ok ol gm bj"><a class="af ou" href="https://www.cdc.gov/visionhealth/resources/infographics/future.html" rel="noopener ugc nofollow" target="_blank"><em class="qi">“Looking Ahead: Improving Our Vision for the Future,” CDC</em></a></p><p id="a5f1" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">To illustrate, consider how the Airbnb homepage might appear to someone who has experienced a significant loss of visual acuity. As shown below, the text becomes nearly impossible to read comfortably.</p></div></div><div class="ni"><div class="ab ca"><div class="mc qj md qk me ql ce qm cf qn ch bg"><figure class="qp qq qr qs qt ni qu qv paragraph-image"><div role="button" tabindex="0" class="nj nk fi nl bg nm"><div class="na nb qo"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 1100w, https://miro.medium.com/v2/resize:fit:2000/format:webp/1*bdwe8Y9LneU-4vJgEg5mTA.png 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 
2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*bdwe8Y9LneU-4vJgEg5mTA.png 640w, https://miro.medium.com/v2/resize:fit:720/1*bdwe8Y9LneU-4vJgEg5mTA.png 720w, https://miro.medium.com/v2/resize:fit:750/1*bdwe8Y9LneU-4vJgEg5mTA.png 750w, https://miro.medium.com/v2/resize:fit:786/1*bdwe8Y9LneU-4vJgEg5mTA.png 786w, https://miro.medium.com/v2/resize:fit:828/1*bdwe8Y9LneU-4vJgEg5mTA.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*bdwe8Y9LneU-4vJgEg5mTA.png 1100w, https://miro.medium.com/v2/resize:fit:2000/1*bdwe8Y9LneU-4vJgEg5mTA.png 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" /></picture></div></div><figcaption class="qw fe qx na nb qy qz be b bf z dt"><em class="ra">Airbnb’s home page with simulated blurry vision.</em></figcaption></figure></div></div></div><div class="ab ca"><div class="ch bg fy fz ga gb"><h1 id="da8c" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Browser Zoom</h1><p id="0e15" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">To better understand the accessibility challenge, let us explore how browser zoom functionality works. You may already be familiar with this feature, using keyboard shortcuts like Command / Ctrl + or Command / Ctrl - to scale all content within a window. 
When you increase the zoom level beyond 100%, the viewport’s height and width (measured in CSS pixels) decrease proportionally, while the content is scaled up to fill the window.</p><p id="fb4a" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">As part of our accessibility testing strategy, we used browser zoom to test the usability of our pages at both desktop and mobile sizes. Desktop testing showed that our pages did relatively well at the 200% zoom level with our responsive web approach across the site. We saw fewer issues in the overall user experience when compared to mobile web.</p><p id="40ad" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">This works well on desktop, where we serve a smaller breakpoint (e.g., wide to compact) and the viewport is relatively spacious. However, the limitations of browser zoom become more pronounced on mobile web, where the viewport is smaller. If we were to scale the content to 200% in a mobile viewport, it would have to fit into a viewport that is half the width and half the height of the original. This can result in significant accessibility issues, as the text and UI elements become extremely difficult to read and interact with. 
As shown in the image on the right, it is not possible to view even a single listing within one screen without scrolling, which leads to a frustrating experience.</p><figure class="qp qq qr qs qt ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fi nl bg nm"><div class="na nb rg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*wEVOd9XZZNdg6Yyr 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*wEVOd9XZZNdg6Yyr 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*wEVOd9XZZNdg6Yyr 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*wEVOd9XZZNdg6Yyr 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*wEVOd9XZZNdg6Yyr 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*wEVOd9XZZNdg6Yyr 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*wEVOd9XZZNdg6Yyr 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*wEVOd9XZZNdg6Yyr 640w, https://miro.medium.com/v2/resize:fit:720/0*wEVOd9XZZNdg6Yyr 720w, https://miro.medium.com/v2/resize:fit:750/0*wEVOd9XZZNdg6Yyr 750w, https://miro.medium.com/v2/resize:fit:786/0*wEVOd9XZZNdg6Yyr 786w, https://miro.medium.com/v2/resize:fit:828/0*wEVOd9XZZNdg6Yyr 828w, https://miro.medium.com/v2/resize:fit:1100/0*wEVOd9XZZNdg6Yyr 1100w, https://miro.medium.com/v2/resize:fit:1400/0*wEVOd9XZZNdg6Yyr 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qw fe qx na nb qy qz be b bf z dt"><em class="ra">Airbnb’s homepage shown at browser zoom 100% on the left, and the same screen shown at 200% on the right, where the search and categories are cut off entirely and even the first listing is not visible.</em></figcaption></figure><h1 id="acb6" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Font Scaling</h1><p id="e8ad" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Font scaling is the term we’ll use to describe the ability to adjust text size independently of overall page zoom. Unlike browser zoom, which scales all content proportionally, font scaling applies only to the text elements on the page. This allows users to customize the font size to their preferred reading size without much affecting the layout or responsiveness of the rest of the content.</p><p id="3b00" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Font scaling is driven by the user’s preferred font size. Unlike zoom, this setting is applied to all sites. 
Below is an example of how font scaling applies to just the text on the screen, showing that only the text scales up, rather than all of the content.</p><figure class="qp qq qr qs qt ni"><div class="rh jj l fi"></div></figure><p id="69aa" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Video Description:<em class="qi"> Airbnb text is scaled by setting the font size in the Arc browser, showing the scaling from 16px to 32px.</em></p><p id="fc49" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">This concept of independent font scaling is similar to the Dynamic Type feature on iOS, as we discussed in our blog post <a class="af ou" href="https://medium.com/airbnb-engineering/tagged/dynamic-type" rel="noopener">“Supporting Dynamic Type at Airbnb”</a>. Dynamic Type allows users to set a preferred system-wide text size, which then automatically adjusts the font size across all compatible apps.</p><p id="5e25" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Considering our existing strategies for accessibility on iOS, incorporating font scaling (vs zoom scaling) into our web accessibility approach was a natural next step to bring parity to our approaches across platforms.</p><h1 id="16c7" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Understanding px, em vs rem</h1><p id="553f" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Now that we understand why font scaling is so powerful for mobile web, we should focus on why we might choose one CSS length unit over another for supporting font scaling. In this blog post we are only going to focus on px, em, and rem, but there are other units as well. 
CSS length units are connected to font scaling because they determine how text and other elements are sized on a web page. Some length units are fixed, meaning they don’t change based on the user’s font size settings, while others are relative, meaning they scale proportionally with the font size.</p><p id="8b19" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Let’s take a closer look at three CSS length units and how they relate to font scaling:</p><ul class=""><li id="d527" class="no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol rk on oo bj">px units are the most commonly used on the web; in theory, each one represents one pixel on the screen. They are a fixed unit, meaning the rendered value does not change with the user’s font size setting.</li><li id="23e9" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol rk on oo bj">em units, however, are relative units based on the parent element’s font size. The name ‘em’ comes from the width of the capital letter ‘M’ in a given typeface, which was traditionally used as the reference point for font sizes. 1em is equal to the current font size, typically 16px at the default value. em units scale proportionally, so they are affected by their parents’ font sizes.</li><li id="5320" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol rk on oo bj">rem units, short for “root em”, are similar to em units in that they are proportional to font size, but they only use the root element (the html element) to calculate their font size. This means that rem units offer font scaling, but are not affected by their parent’s font size.</li></ul><p id="dedd" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">The choice between em and rem units often comes down to the level of control and predictability required for font scaling. 
While em units can be used, they can lead to cascading font size changes that may be difficult to manage, especially in complex layouts. In contrast, rem units provide a more consistent and predictable approach to font scaling, as they are always relative to the root element’s font size.</p><p id="84a6" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">This is illustrated in the CodePen example, where the different font scaling behaviors of px, em, and rem units are demonstrated. In situations where font scaling is a critical requirement, such as the Airbnb example mentioned, the use of rem units can be a more reliable choice to ensure a consistent and maintainable font scaling solution.</p><figure class="qp qq qr qs qt ni"><div class="rh jj l fi"></div></figure><p id="21aa" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Relative units like rem can be used anywhere a fixed unit like px can be used. However, indiscriminate use of rem units across all properties can lead to unwanted scaling behavior and increased complexity.</p><p id="9e4c" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">In the case of Airbnb, the team decided to prioritize the use of rem units specifically for font scaling, rather than scaling all elements proportionally. 
This targeted approach provided the key benefit of consistent text scaling, without the potential downsides of scaling every aspect of the layout.</p><p id="4981" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">The rationale behind this decision was twofold:</p><ol class=""><li id="54e3" class="no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on oo bj">Scaling <em class="qi">everything</em> using rem units would have been similar to Browser Zoom and potentially introduced unintended layout issues.</li><li id="8316" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol om on oo bj">The primary focus was on providing a mobile-friendly font scaling solution. By targeting font sizes with rem units, the team could ensure that the most important content, the text, scaled appropriately.</li></ol><h1 id="ff20" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Enabling a Seamless Transition for Designers and Developers</h1><p id="2724" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Moving from pixel-based values to rem units as a company-wide change in CSS practice can be a significant challenge, especially when working across multiple teams. The time and effort required to educate designers and frontend developers on the new approach, and to have them convert their existing pixel-based values to rem units, can be a major barrier to adoption. 
To address this, the Airbnb team decided to focus on automating the unit conversion process as much as possible, enabling a more seamless transition to the new rem-based system.</p><h1 id="18c6" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Reducing Friction in Design Iterations</h1><p id="d9b0" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Instead of requiring designers to think in new units or introduce a conversion step for web only, we decided to continue to author our CSS in px units. This reduced the amount of training required for teams to start using rem units out of the gate.</p><p id="f0dd" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">One area we did focus on with our design teams was testing their designs with font scaling, using the <a class="af ou" href="https://www.figma.com/community/plugin/892114953056389734/text-resizer-accessibility-checker" rel="noopener ugc nofollow" target="_blank">Text Resizer - Accessibility Checker</a> to simulate what a design might look like at 2x the font size. This tool helped us spot problems earlier in the design process.</p><h1 id="f7eb" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Addressing the Complexity of Two CSS-in-JS Systems</h1><p id="f837" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Airbnb is in the process of transitioning from <a class="af ou" href="https://github.com/airbnb/react-with-styles" rel="noopener ugc nofollow" target="_blank">React-with-Styles</a> to a newer approach using <a class="af ou" href="https://linaria.dev/" rel="noopener ugc nofollow" target="_blank">Linaria</a>. 
While the adoption of Linaria was progressing quickly, we recognized the need to support both styling systems for a consistent experience. Managing the conversion across these two different CSS-in-JS systems posed an additional challenge.</p><h2 id="754f" class="rm ow gt be ox rn ro dx pb rp rq dz pf nz rr rs rt od ru rv rw oh rx ry rz sa bj">Linaria</h2><p id="bcb1" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">By leveraging Linaria’s support for CSS custom properties, the team was able to create new typography theme values that automatically converted the existing pixel-based values to their rem equivalents. This approach allowed the team to introduce the new rem-based theme values in a centralized manner, making them available to child elements. This gave the team the ability to override the rem values on a per-page basis, providing the necessary flexibility during the transition process.</p><pre class="qp qq qr qs qt sb sc sd bo se ba bj">import { typography } from './site-theme';<br /><br />// Loops through the CSS Vars we use for typography and converts them<br />// from px to rem units.<br />const theme = css`<br /> ${getCssVariables({ typography: replacePxWithREMs(typography) })}<br /> // Changes from:<br /> // - body-font-size: 16px;<br /> // To<br /> // - body-font-size: 1rem;<br />`;<br /><br />// Use the class name generated from linaria to override the theme<br />// variables for the children of this component.<br />const RemThemeLocalProvider: React.FC = ({ children }) =&gt; {<br /> const cx = useCx();<br /> return &lt;div className={cx(theme)}&gt;{children}&lt;/div&gt;;<br />};</pre><p id="4868" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Although this approach helped us convert most of the font scaling properties, there were many places in our code where we used px-based values outside the theme. 
Linaria’s support for PostCSS plugins made solving these areas relatively easy. We leveraged <a class="af ou" href="https://github.com/cuth/postcss-pxtorem#readme" rel="noopener ugc nofollow" target="_blank">postcss-pxtorem</a> to help target those values more easily. We started by using an allow list, so that we could carefully apply this change to a smaller set of early adopting pages.</p><p id="6fd1" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">It was important to provide an escape hatch for cases where front-end engineers had a reason to keep px units. Luckily, we were able to provide this by using a different casing for the px unit, as shown below.</p><pre class="qp qq qr qs qt sb sc sd bo se ba bj">/* `px` is converted to `rem` */<br />.convert {<br />  font-size: 16px; /* converted to 1rem */<br />}<br />/* `Px` or `PX` is ignored by `postcss-pxtorem` <br />   but still accepted by browsers */<br />.ignore {<br />  font-size: 200Px;<br />  font-size: clamp(16Px, 2rem, 32Px);<br />}</pre><h2 id="1e0d" class="rm ow gt be ox rn ro dx pb rp rq dz pf nz rr rs rt od ru rv rw oh rx ry rz sa bj">React with Styles</h2><p id="feb2" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">A good amount of our frontend code still uses react-with-styles, so we had to find another way to support these cases with an easy conversion. To do this, we created a simple higher-order component that made the conversion straightforward. 
First, we created a wrapper for the withStyles function, shown below, with an option to opt out of the conversion.</p><pre class="qp qq qr qs qt sb sc sd bo se ba bj">export const withRemStyles = (<br />  styleFn?: Nullable&lt;(theme: Theme) =&gt; Styles&gt;,<br />  options?: WithStylesOptions &amp; { disableConvertToRemUnits?: boolean },<br />) =&gt; {<br />  const disableConvertToRemUnits = getDisableConvertToRemUnits(options);<br />  // If conversion is disabled, just return the original withStyles function<br />  if (disableConvertToRemUnits) {<br />    return _withStyles(styleFn, options);<br />  }<br />  // Otherwise, wrap the original style function with a new function<br />  // that converts px to rem<br />  return _withStyles((theme: Theme) =&gt; {<br />    if (styleFn) {<br />      const styles = styleFn(theme);<br />      const remStyles = convertToRem(styles);<br />      return remStyles;<br />    }<br />    return {};<br />  }, options);<br />};</pre><p id="98f3" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">The convertToRem function then walks through the keys and values and maps a converted value for any of the font sizing attributes. This allowed us to automate the conversion process in a more straightforward way.</p><h1 id="6519" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Improvements for Testing Components</h1><p id="7f48" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">With these two challenges out of the way, we could start testing our components to verify whether there were any major issues we needed to resolve before rolling out. 
In our component documentation and tooling, we built an internal plugin that makes testing with font scaling easier by setting the font-size on the html element directly.</p><p id="965a" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Screenshot testing has helped our teams catch visual regressions. Adding support for capturing additional screenshots at different root font sizes has helped our product teams review what a component looks like at different font scales. To do this, we allow additional font sizes to be set when capturing the screenshots, so you don’t have to create new component variations just for font scaling.</p><h1 id="0d96" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Font Scaling on Mobile Safari</h1><p id="9b30" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Supporting font scaling for Mobile Safari was more difficult. Unlike other browsers, there is not a font size preference available in Mobile Safari. However, Apple has released support for its own font value, -apple-system-body, though there are some important considerations.</p><p id="6596" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Since macOS High Sierra (10.13), desktop Safari also supports the font preference, but there is not an easy “font size” configuration available in macOS. Because there can be unexpected behavior on desktop Safari, we used a @supports statement to exclude it. 
The code below will only target Mobile Safari.</p><pre class="qp qq qr qs qt sb sc sd bo se ba bj">/* Apple's Dynamic Type requires this font family to be used */<br />/* Only target iOS/iPadOS */<br />@supports (font: -apple-system-body) and (-webkit-touch-callout: default) {<br />  :root {<br />    font: -apple-system-body;<br />  }<br />}</pre><p id="3a02" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Another consideration is that the default (“100%”) font size does not equal the standard font size of 16px, but rather 17px. This is a very subtle difference, but it is critical for the design quality bar we aim to achieve at Airbnb. To resolve this issue, we ended up using an inline head script to normalize the value. By placing it early in page execution, we avoided a visible change in font size.</p><pre class="qp qq qr qs qt sb sc sd bo se ba bj">(() =&gt; {<br />  // don't do anything if the browser doesn't match the supports statement<br />  if (!CSS.supports('(font: -apple-system-body) and (-webkit-touch-callout: default)')) return;<br />  const { documentElement } = document;<br />  // Must create an element since the root element styles are not yet parsed.<br />  const div = document.createElement('div');<br />  div.setAttribute('style', 'font: -apple-system-body');<br />  // Body is not available yet so this has to be added to the root element<br />  documentElement.appendChild(div);<br />  const style = getComputedStyle(div);<br />  if (style.fontSize === '17px') {<br />    documentElement.style.setProperty('font-size', '16px');<br />  }<br />  documentElement.removeChild(div);<br />})();</pre><p id="e4b7" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Then, when the page loads, we use a resize observer to detect whether the value changes again, and unset or set the font-size property on the html element accordingly. 
This lets us support scalable fonts without significantly affecting the default font size (100%).</p><h1 id="8cee" class="ov ow gt be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Impact</h1><p id="d1f5" class="pw-post-body-paragraph no np gt nq b nr rb nt nu nv rc nx ny nz rd ob oc od re of og oh rf oj ok ol gm bj">Supporting scalable fonts is an investment that should make a dramatic difference for our Hosts and guests with low vision and anyone who benefits from larger font sizes and control over their browsing experience. Below are two examples of the home page showing how the default font size (16px) appears to someone who has blurry vision and what it looks like when the font size is doubled (32px). The second image is far more legible and usable.</p></div></div><div class="ni"><div class="ab ca"><div class="mc qj md qk me ql ce qm cf qn ch bg"><figure class="qp qq qr qs qt ni qu qv paragraph-image"><div role="button" tabindex="0" class="nj nk fi nl bg nm"><div class="na nb sk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 1100w, https://miro.medium.com/v2/resize:fit:2000/format:webp/1*GJi2NA2bI5RU7uOqsRQMPw.png 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, 
(-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*GJi2NA2bI5RU7uOqsRQMPw.png 640w, https://miro.medium.com/v2/resize:fit:720/1*GJi2NA2bI5RU7uOqsRQMPw.png 720w, https://miro.medium.com/v2/resize:fit:750/1*GJi2NA2bI5RU7uOqsRQMPw.png 750w, https://miro.medium.com/v2/resize:fit:786/1*GJi2NA2bI5RU7uOqsRQMPw.png 786w, https://miro.medium.com/v2/resize:fit:828/1*GJi2NA2bI5RU7uOqsRQMPw.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*GJi2NA2bI5RU7uOqsRQMPw.png 1100w, https://miro.medium.com/v2/resize:fit:2000/1*GJi2NA2bI5RU7uOqsRQMPw.png 2000w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 1000px" /></picture></div></div><figcaption class="qw fe qx na nb qy qz be b bf z dt">Font size comparison for Airbnb listing readability with blurred vision: 16px vs. 32px.</figcaption></figure></div></div></div><div class="ab ca"><div class="ch bg fy fz ga gb"><p id="43f5" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Choosing font scaling as the product accessibility strategy brought about a range of significant benefits that notably enhanced our platform’s overall user experience. Making that change using automation to convert to rem units made this transition easier. 
When we looked at our overall issue count after these changes were rolled out site-wide, more than 80% of our existing Resize Text issues had been resolved. Moreover, we have seen fewer new issues since then.</p><p id="1262" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">To conclude, our journey to enhance Resize Text on the web has been filled with valuable, practical lessons. From how we strategically apply rem units, to the role of tooling and automation, each lesson has been a vital step forward in elevating our user experience on Airbnb. We hope that by sharing our journey, we can help others navigate this transition more seamlessly. Our work is ongoing, and we are committed to continuously advancing Airbnb’s accessibility. If you’re passionate about such challenges, we invite you to <a class="af ou" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">explore career opportunities at Airbnb</a>.</p><p id="f1ac" class="pw-post-body-paragraph no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj">Thanks to:</p><ul class=""><li id="1cbd" class="no np gt nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol rk on oo bj">Alan Pinto Souza, Dennis Wilkins, Jimmy Guo, and Andrew Scheuermann for advice and technical review.</li><li id="4f78" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol rk on oo bj">Sterling DeMille, Riley Glusker and Ryan Booth for being early product partners.</li><li id="bc65" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol rk on oo bj">Jordanna Kwok, Sarah Alley and JN Vollmer for supporting the approach.</li><li id="75d1" class="no np gt nq b nr op nt nu nv oq nx ny nz or ob oc od os of og oh ot oj ok ol rk on oo bj">Veronica Reyes and Jamie Cristal for providing design support.</li></ul><p id="68cc" class="pw-post-body-paragraph no np gt nq b nr ns nt nu 
nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gm bj"><em class="qi">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/rethinking-text-resizing-on-web-1047b12d2881</link>
      <guid>https://medium.com/airbnb-engineering/rethinking-text-resizing-on-web-1047b12d2881</guid>
      <pubDate>Thu, 16 May 2024 19:24:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Animations: Bringing the Host Passport to Life on iOS]]></title>
      <description><![CDATA[<div class="ab ca"><div class="ch bg fy fz ga gb"><div><div class="hu hv hw hx hy"></div><p id="9042" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">How Airbnb enabled hosts and guests to connect and introduce themselves through the Host Passport.</p><p id="a404" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj"><strong class="nc gu">By:</strong> <a class="af ny" href="https://www.linkedin.com/in/annelu1/" rel="noopener ugc nofollow" target="_blank">Anne Lu</a></p><figure class="oc od oe of og oh nz oa paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><div class="nz oa ob"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*BmPtSglh_yBHzCxIn_0c_g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/1*BmPtSglh_yBHzCxIn_0c_g.png 640w, https://miro.medium.com/v2/resize:fit:720/1*BmPtSglh_yBHzCxIn_0c_g.png 720w, https://miro.medium.com/v2/resize:fit:750/1*BmPtSglh_yBHzCxIn_0c_g.png 750w, https://miro.medium.com/v2/resize:fit:786/1*BmPtSglh_yBHzCxIn_0c_g.png 786w, https://miro.medium.com/v2/resize:fit:828/1*BmPtSglh_yBHzCxIn_0c_g.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*BmPtSglh_yBHzCxIn_0c_g.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*BmPtSglh_yBHzCxIn_0c_g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="080c" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Introduction</h1><p id="944f" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">In May 2023 we introduced the Host Passport as part of our <a class="af ny" href="https://www.airbnb.com/release/2023-summer" rel="noopener ugc nofollow" target="_blank">Summer Release</a>. We wanted to give Hosts a way to introduce themselves, and start building a more personal connection with their guests. To that end, we created the Host Passport, which appears in the bottom corner of each Private Room listing result with a photo of the Host on the cover. 
Guests can tap it to fully open the Host Passport and learn more about the Host and get a sense for the real live person they would be staying with.</p><figure class="oc od oe of og oh nz oa paragraph-image"><div class="nz oa pq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*6zf4AHmHqrCaX74j 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*6zf4AHmHqrCaX74j 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*6zf4AHmHqrCaX74j 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*6zf4AHmHqrCaX74j 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*6zf4AHmHqrCaX74j 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*6zf4AHmHqrCaX74j 1100w, https://miro.medium.com/v2/resize:fit:640/format:webp/0*6zf4AHmHqrCaX74j 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*6zf4AHmHqrCaX74j 640w, https://miro.medium.com/v2/resize:fit:720/0*6zf4AHmHqrCaX74j 720w, https://miro.medium.com/v2/resize:fit:750/0*6zf4AHmHqrCaX74j 750w, https://miro.medium.com/v2/resize:fit:786/0*6zf4AHmHqrCaX74j 786w, https://miro.medium.com/v2/resize:fit:828/0*6zf4AHmHqrCaX74j 828w, https://miro.medium.com/v2/resize:fit:1100/0*6zf4AHmHqrCaX74j 1100w, https://miro.medium.com/v2/resize:fit:640/0*6zf4AHmHqrCaX74j 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 
67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" /></picture></div><figcaption class="pr fe ps nz oa pt pu be b bf z dt">The Passport animation</figcaption></figure><p id="44a6" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">The Host Passport offers Hosts a way to introduce themselves and set guest expectations, and allows guests to quickly start discovering who they could be sharing a space with.</p><p id="c6a3" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Delivering this animation with high pixel accuracy, fluidity, high performance, and a spark of delight led us to encounter and solve many novel technical issues unique to each client platform that we support. While the Host Passport appears in the web, Android, and iOS apps, this article focuses specifically on the iOS implementation.</p><h1 id="96c8" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Implementing the Host Passport on iOS</h1><p id="1a8e" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">While we’ve almost entirely <a class="af ny" rel="noopener" href="https://medium.com/airbnb-engineering/unlocking-swiftui-at-airbnb-ea58f50cde49">switched over to SwiftUI</a> when it comes to building new components and screens in our app, we opted to use UIKit for the passport animation. We did this for a couple of reasons. Firstly, at the time of this writing, SwiftUI does not have APIs supporting custom transitions and navigation patterns, so our screen navigation and transition layer remains in UIKit. 
And secondly, while <a class="af ny" href="https://developer.apple.com/documentation/swiftui/keyframes" rel="noopener ugc nofollow" target="_blank">keyframe timing</a> was introduced for SwiftUI animations with iOS 17, our version support extended back to iOS 15 at the time of release.</p><p id="4996" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">UIKit provides a ready-to-use <a class="af ny" href="https://developer.apple.com/documentation/uikit/animation_and_haptics" rel="noopener ugc nofollow" target="_blank">framework</a> that enables the development of smooth, polished animations. Combined with our <a class="af ny" rel="noopener" href="https://medium.com/airbnb-engineering/motion-engineering-at-scale-5ffabfc878">in-house</a> declarative transition framework, we were starting with a solid foundation that we could leverage to create complex animations. The work lay in bridging the gap between established patterns and our novel requirements; while we were already experienced in creating delightful two-dimensional animations, three-dimensional animation was uncharted territory.</p><p id="7415" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">The complexity of this animation lies in its many moving parts. 
The challenge lies not in animating single properties, but rather coordinating many for a cohesive effect that is not only functional, but also delightful.</p><h1 id="1869" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">The Passport</h1><h1 id="b7ff" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">The anchor point</h1><p id="c7b0" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">The <a class="af ny" href="https://developer.apple.com/documentation/uikit/uiview/4051982-anchorpoint" rel="noopener ugc nofollow" target="_blank">anchor point</a> is a property of a view’s bounds, defaulting to the relative [0.5, 0.5], or exact center. <a class="af ny" href="https://developer.apple.com/documentation/quartzcore/1436524-catransform3drotate" rel="noopener ugc nofollow" target="_blank">Rotation animations</a> rotate around this anchor point, so by default, views rotate around their midpoints, which gives a card rotating effect rather than a page flipping one.</p><figure class="oc od oe of og oh nz oa paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><div class="nz oa pv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ODcsPy1UgzRpojU5 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ODcsPy1UgzRpojU5 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ODcsPy1UgzRpojU5 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*ODcsPy1UgzRpojU5 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ODcsPy1UgzRpojU5 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ODcsPy1UgzRpojU5 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ODcsPy1UgzRpojU5 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 
700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*ODcsPy1UgzRpojU5 640w, https://miro.medium.com/v2/resize:fit:720/0*ODcsPy1UgzRpojU5 720w, https://miro.medium.com/v2/resize:fit:750/0*ODcsPy1UgzRpojU5 750w, https://miro.medium.com/v2/resize:fit:786/0*ODcsPy1UgzRpojU5 786w, https://miro.medium.com/v2/resize:fit:828/0*ODcsPy1UgzRpojU5 828w, https://miro.medium.com/v2/resize:fit:1100/0*ODcsPy1UgzRpojU5 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ODcsPy1UgzRpojU5 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fe ps nz oa pt pu be b bf z dt">A rounded rectangle rotating around a center anchor point</figcaption></figure><p id="e0cb" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">To achieve the desired page rotation, we faced a dilemma with the anchor point. 
Moving the anchor point to [0, 0.5], the midpoint of the view’s leading edge, would produce the page-turning effect, but it risked disrupting other parts of the animation: the anchor point is the basis not only for rotation but also for other transforms, such as scaling and translation. Altering it for the three-dimensional rotation has a knock-on effect on those transforms, causing side effects we would then have to work around.</p><p id="0453" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">With this in mind, we used an alternative approach: instead of directly manipulating the anchor point, we created views whose visible content occupied only half of the space, leaving the other half transparent. As the rotation occurs, the visible half appears to rotate around its left edge, while the view as a whole still rotates around its default center anchor point.</p><p id="a456" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">With this, we are able to animate our book page rotation without introducing complications to the other transforms. 
See the example below, where there is a border added around the entire view, including the transparent part, to show its actual size.</p><figure class="oc od oe of og oh nz oa paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><div class="nz oa pv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*alBa06VX2Ad7oKkM 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*alBa06VX2Ad7oKkM 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*alBa06VX2Ad7oKkM 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*alBa06VX2Ad7oKkM 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*alBa06VX2Ad7oKkM 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*alBa06VX2Ad7oKkM 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*alBa06VX2Ad7oKkM 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*alBa06VX2Ad7oKkM 640w, https://miro.medium.com/v2/resize:fit:720/0*alBa06VX2Ad7oKkM 720w, https://miro.medium.com/v2/resize:fit:750/0*alBa06VX2Ad7oKkM 750w, https://miro.medium.com/v2/resize:fit:786/0*alBa06VX2Ad7oKkM 786w, https://miro.medium.com/v2/resize:fit:828/0*alBa06VX2Ad7oKkM 828w, https://miro.medium.com/v2/resize:fit:1100/0*alBa06VX2Ad7oKkM 1100w, https://miro.medium.com/v2/resize:fit:1400/0*alBa06VX2Ad7oKkM 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fe ps nz oa pt pu be b bf z dt">A rounded rectangle with a red half and a transparent half rotating around a center anchor point</figcaption></figure><h1 id="07a4" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Page composition</h1><p id="5b20" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">With the rotation solved, we next had to think about how to compose the view to look like a book. We ended up accomplishing that effect by using a compound view. At a basic level, the booklet is composed of a front page and two inside pages. 
That meant we needed three separate views:</p><figure class="oc od oe of og oh nz oa paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><div class="nz oa pv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*a5qwX-rVG7VwPrG8 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*a5qwX-rVG7VwPrG8 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*a5qwX-rVG7VwPrG8 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*a5qwX-rVG7VwPrG8 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*a5qwX-rVG7VwPrG8 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*a5qwX-rVG7VwPrG8 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*a5qwX-rVG7VwPrG8 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*a5qwX-rVG7VwPrG8 640w, https://miro.medium.com/v2/resize:fit:720/0*a5qwX-rVG7VwPrG8 720w, https://miro.medium.com/v2/resize:fit:750/0*a5qwX-rVG7VwPrG8 750w, https://miro.medium.com/v2/resize:fit:786/0*a5qwX-rVG7VwPrG8 786w, https://miro.medium.com/v2/resize:fit:828/0*a5qwX-rVG7VwPrG8 828w, https://miro.medium.com/v2/resize:fit:1100/0*a5qwX-rVG7VwPrG8 1100w, https://miro.medium.com/v2/resize:fit:1400/0*a5qwX-rVG7VwPrG8 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and 
(max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fe ps nz oa pt pu be b bf z dt">A view that rotates like a folding page with View 1 (front), View 2 (inner left) and View 3 (inner right)</figcaption></figure><p id="9327" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">By stitching them together, we create the Passport booklet.</p><p id="9008" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">To create the impression of a page flip, we needed to employ another trick; while real life pages have a front and a back, the same is not true of a view. Therefore, in order to make it <em class="pw">look</em> like a page turning, we timed it so that during the page turn, the front view is swapped for the back view at the exact point when the page is completely orthogonal to the viewer’s perspective. This creates the illusion of a front and a back. 
Et voila!</p><figure class="oc od oe of og oh nz oa paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><div class="nz oa pv"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*eec-Jenr8gkmmowb 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*eec-Jenr8gkmmowb 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*eec-Jenr8gkmmowb 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*eec-Jenr8gkmmowb 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*eec-Jenr8gkmmowb 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*eec-Jenr8gkmmowb 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*eec-Jenr8gkmmowb 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*eec-Jenr8gkmmowb 640w, https://miro.medium.com/v2/resize:fit:720/0*eec-Jenr8gkmmowb 720w, https://miro.medium.com/v2/resize:fit:750/0*eec-Jenr8gkmmowb 750w, https://miro.medium.com/v2/resize:fit:786/0*eec-Jenr8gkmmowb 786w, https://miro.medium.com/v2/resize:fit:828/0*eec-Jenr8gkmmowb 828w, https://miro.medium.com/v2/resize:fit:1100/0*eec-Jenr8gkmmowb 1100w, https://miro.medium.com/v2/resize:fit:1400/0*eec-Jenr8gkmmowb 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, 
(min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fe ps nz oa pt pu be b bf z dt">A rotating rounded rectangle that changes from blue to red when it is perpendicular to the screen</figcaption></figure><h1 id="a92b" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Integrating with our Animation Framework</h1><p id="1f1d" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">At this point, we had a passport booklet with the ability to flip open in three dimensions.</p><p id="77b1" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">In order to accomplish the next step in the animation we needed to integrate our book animation with our <a class="af ny" rel="noopener" href="https://medium.com/airbnb-engineering/motion-engineering-at-scale-5ffabfc878">declarative animation framework</a>, which handled transitioning the animating passport from the listing results view onto the modal view. 
Our animation framework allows us to perform a shared element transition, where a view animates seamlessly between two separate screens, in just a few lines of code.</p><p id="c572" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">First, we created a transition definition that describes the type of animation we wanted:</p><pre class="oc od oe of og px py pz bo qa ba bj">let passportTransition: TransitionDefinition = [<br />  SharedElementIdentifiers.Passport.passport(listingId): .sharedElement<br />]</pre><p id="f827" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Next, we attached this identifier to both the source view (the passport in the listing search results) and the destination view (the open passport card in the context sheet). We set the transition definition on the modal presentation, and from there, the framework created the animation that moved our passport view from its starting location in the listing results to its final location in the modal.</p><p id="1736" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Under typical circumstances, our framework captures a “snapshot” of the view by rendering it as a static image. The snapshot is then animated from the initial position to the final position, while the original source and destination views stay hidden. This allows us to play the animation of the view moving from one place to another in a performant way while keeping the view hierarchy intact.</p><p id="7691" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">In our case, however, a static snapshot didn’t have the functionality we needed, which was the ability to play the page flip animation alongside the shared element transition. 
Therefore, we created a custom snapshot that we used in place of the default static snapshot. This custom snapshot was a copy of the view that did have animation capabilities, which we then triggered to play alongside the animated transition so that the two would be perfectly in sync. Enter <a class="af ny" href="https://developer.apple.com/documentation/uikit/uiviewpropertyanimator" rel="noopener ugc nofollow" target="_blank">UIViewPropertyAnimator</a>: a class that allows us to define animation blocks and dynamically control their playback. It provides the flexibility to start, stop, or modify animations in real time.</p><p id="6f0c" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">It neatly encapsulated our animations within a single object, which could then be passed along to our animation framework. As our framework handled the screen-to-screen transition, it triggered the custom animation to play in sync with that transition.</p><h1 id="b0f0" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Timing</h1><p id="a925" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">It isn’t only <em class="pw">where</em> a view moves that determines realism, but also, just as importantly, <em class="pw">when</em>. The passport opens in the span of a moment, but the simple elegance belies the complexity underneath.</p><p id="da13" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">On a closer look, our animation consists of many synchronized individual animations. The passport grows in size, moves along the x and y axes, and rotates its pages in 3D space, while its shadows shift to simulate light and movement. 
To get things just right, we use a separate timing curve for each property.</p><p id="8013" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">But we need even more specificity than that; our design calls for these to start and stop at different points along the animation duration. For that, we time specific events to relative points within the timing curve via keyframes. To expand on our earlier example, here is our animator with keyframes set.</p><pre class="oc od oe of og px py pz bo qa ba bj">let animator = UIViewPropertyAnimator(duration: 2.0, curve: .easeInOut) {<br />  // Enable keyframe animations, inheriting the duration of the<br />  // parent property animator<br />  UIView.animateKeyframes(withDuration: 0, delay: 0) {<br />    // At the start of the animation, translate the view 100 points downwards<br />    UIView.addKeyframe(withRelativeStartTime: 0, relativeDuration: 0.5) {<br />      cardView.transform = CGAffineTransform(translationX: 0, y: 100)<br />    }<br /><br />    // At the halfway point, flip the color to coincide with the turning<br />    // point of our view.<br />    UIView.addKeyframe(withRelativeStartTime: 0.5, relativeDuration: 0) {<br />      cardView.backgroundColor = .red<br />    }<br /><br />    // Return to original position<br />    UIView.addKeyframe(withRelativeStartTime: 0.5, relativeDuration: 0.5) {<br />      cardView.transform = .identity<br />    }<br />  }<br />}<br /><br />animator.startAnimation()</pre><p id="eb26" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Next, let’s take a closer look at spring timings and their unique characteristics. 
When creating animations, we have the option of different types of easing functions for a naturalistic feel.</p><p id="8912" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Easing functions like linear and cubic are common timing curves, as depicted in the graphs below. They give us the ability to specify the speed of our animation over time.</p><p id="7faf" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Linear</p></div></div><div class="oh"><div class="ab ca"><div class="mc qg md qh me qi ce qj cf qk ch bg"><div class="oc od oe of og ab kz"><figure class="ly oh ql qm qn qo qp paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*pqb4ArPLkdTxYUwF 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*pqb4ArPLkdTxYUwF 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*pqb4ArPLkdTxYUwF 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*pqb4ArPLkdTxYUwF 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*pqb4ArPLkdTxYUwF 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*pqb4ArPLkdTxYUwF 1100w, https://miro.medium.com/v2/resize:fit:1246/format:webp/0*pqb4ArPLkdTxYUwF 1246w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 623px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*pqb4ArPLkdTxYUwF 640w, 
https://miro.medium.com/v2/resize:fit:720/0*pqb4ArPLkdTxYUwF 720w, https://miro.medium.com/v2/resize:fit:750/0*pqb4ArPLkdTxYUwF 750w, https://miro.medium.com/v2/resize:fit:786/0*pqb4ArPLkdTxYUwF 786w, https://miro.medium.com/v2/resize:fit:828/0*pqb4ArPLkdTxYUwF 828w, https://miro.medium.com/v2/resize:fit:1100/0*pqb4ArPLkdTxYUwF 1100w, https://miro.medium.com/v2/resize:fit:1246/0*pqb4ArPLkdTxYUwF 1246w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 623px" /></picture></div></figure><figure class="ly oh qq qm qn qo qp paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*4DmAwAkcXW75X4Zk 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*4DmAwAkcXW75X4Zk 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*4DmAwAkcXW75X4Zk 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*4DmAwAkcXW75X4Zk 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*4DmAwAkcXW75X4Zk 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*4DmAwAkcXW75X4Zk 1100w, https://miro.medium.com/v2/resize:fit:640/format:webp/0*4DmAwAkcXW75X4Zk 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*4DmAwAkcXW75X4Zk 640w, https://miro.medium.com/v2/resize:fit:720/0*4DmAwAkcXW75X4Zk 720w, https://miro.medium.com/v2/resize:fit:750/0*4DmAwAkcXW75X4Zk 750w, https://miro.medium.com/v2/resize:fit:786/0*4DmAwAkcXW75X4Zk 786w, https://miro.medium.com/v2/resize:fit:828/0*4DmAwAkcXW75X4Zk 828w, https://miro.medium.com/v2/resize:fit:1100/0*4DmAwAkcXW75X4Zk 1100w, https://miro.medium.com/v2/resize:fit:640/0*4DmAwAkcXW75X4Zk 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" /></picture><figcaption class="pr fe ps nz oa pt pu be b bf z dt qr fi qs qt">A linear graph (left), a rounded rectangle translating on the Y axis with a linear timing curve (right)</figcaption></figure></div></div></div></div><div class="ab ca"><div class="ch bg fy fz ga gb"><p id="d552" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Cubic</p></div></div><div class="oh"><div class="ab ca"><div class="mc qg md qh me qi ce qj cf qk ch bg"><div class="oc od oe of og ab kz"><figure class="ly oh qu qm qn qo qp paragraph-image"><div role="button" tabindex="0" class="oi oj fi ok bg ol"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*U6JqciN5FCyCQtMr 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*U6JqciN5FCyCQtMr 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*U6JqciN5FCyCQtMr 
750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*U6JqciN5FCyCQtMr 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*U6JqciN5FCyCQtMr 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*U6JqciN5FCyCQtMr 1100w, https://miro.medium.com/v2/resize:fit:1208/format:webp/0*U6JqciN5FCyCQtMr 1208w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 604px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*U6JqciN5FCyCQtMr 640w, https://miro.medium.com/v2/resize:fit:720/0*U6JqciN5FCyCQtMr 720w, https://miro.medium.com/v2/resize:fit:750/0*U6JqciN5FCyCQtMr 750w, https://miro.medium.com/v2/resize:fit:786/0*U6JqciN5FCyCQtMr 786w, https://miro.medium.com/v2/resize:fit:828/0*U6JqciN5FCyCQtMr 828w, https://miro.medium.com/v2/resize:fit:1100/0*U6JqciN5FCyCQtMr 1100w, https://miro.medium.com/v2/resize:fit:1208/0*U6JqciN5FCyCQtMr 1208w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 604px" /></picture></div></figure><figure class="ly oh qv qm qn qo qp paragraph-image"><picture><source 
srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*iyWGq957TIxw6iin 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*iyWGq957TIxw6iin 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*iyWGq957TIxw6iin 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*iyWGq957TIxw6iin 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*iyWGq957TIxw6iin 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*iyWGq957TIxw6iin 1100w, https://miro.medium.com/v2/resize:fit:640/format:webp/0*iyWGq957TIxw6iin 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*iyWGq957TIxw6iin 640w, https://miro.medium.com/v2/resize:fit:720/0*iyWGq957TIxw6iin 720w, https://miro.medium.com/v2/resize:fit:750/0*iyWGq957TIxw6iin 750w, https://miro.medium.com/v2/resize:fit:786/0*iyWGq957TIxw6iin 786w, https://miro.medium.com/v2/resize:fit:828/0*iyWGq957TIxw6iin 828w, https://miro.medium.com/v2/resize:fit:1100/0*iyWGq957TIxw6iin 1100w, https://miro.medium.com/v2/resize:fit:640/0*iyWGq957TIxw6iin 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" /></picture><figcaption class="pr fe ps nz oa pt pu be b bf z dt qw fi qx qt">A cubic graph (left), a rounded rectangle translating on the Y axis with a cubic timing curve (right)</figcaption></figure></div></div></div></div><div class="ab ca"><div class="ch bg fy fz ga gb"><p id="3d41" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">For a more advanced timing, we have the option to use springs. Spring functions are grounded in principles from real-life physics, giving animations a realistic sense of elasticity and bounce.</p><p id="9a36" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Spring</p></div></div><div class="oh"><div class="ab ca"><div class="mc qg md qh me qi ce qj cf qk ch bg"><div class="oc od oe of og ab kz"><figure class="ly oh qy qm qn qo qp paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*i3KjeVrm25I4ZAHx 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*i3KjeVrm25I4ZAHx 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*i3KjeVrm25I4ZAHx 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*i3KjeVrm25I4ZAHx 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*i3KjeVrm25I4ZAHx 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*i3KjeVrm25I4ZAHx 1100w, https://miro.medium.com/v2/resize:fit:1244/format:webp/0*i3KjeVrm25I4ZAHx 1244w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 622px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*i3KjeVrm25I4ZAHx 640w, https://miro.medium.com/v2/resize:fit:720/0*i3KjeVrm25I4ZAHx 720w, https://miro.medium.com/v2/resize:fit:750/0*i3KjeVrm25I4ZAHx 750w, https://miro.medium.com/v2/resize:fit:786/0*i3KjeVrm25I4ZAHx 786w, https://miro.medium.com/v2/resize:fit:828/0*i3KjeVrm25I4ZAHx 828w, https://miro.medium.com/v2/resize:fit:1100/0*i3KjeVrm25I4ZAHx 1100w, https://miro.medium.com/v2/resize:fit:1244/0*i3KjeVrm25I4ZAHx 1244w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 622px" /></picture></figure><figure class="ly oh qz qm qn qo qp paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*VdC34y_MVHtkwVu9 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*VdC34y_MVHtkwVu9 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*VdC34y_MVHtkwVu9 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*VdC34y_MVHtkwVu9 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*VdC34y_MVHtkwVu9 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*VdC34y_MVHtkwVu9 1100w, https://miro.medium.com/v2/resize:fit:640/format:webp/0*VdC34y_MVHtkwVu9 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, 
(min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*VdC34y_MVHtkwVu9 640w, https://miro.medium.com/v2/resize:fit:720/0*VdC34y_MVHtkwVu9 720w, https://miro.medium.com/v2/resize:fit:750/0*VdC34y_MVHtkwVu9 750w, https://miro.medium.com/v2/resize:fit:786/0*VdC34y_MVHtkwVu9 786w, https://miro.medium.com/v2/resize:fit:828/0*VdC34y_MVHtkwVu9 828w, https://miro.medium.com/v2/resize:fit:1100/0*VdC34y_MVHtkwVu9 1100w, https://miro.medium.com/v2/resize:fit:640/0*VdC34y_MVHtkwVu9 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" /></picture><figcaption class="pr fe ps nz oa pt pu be b bf z dt ra fi rb qt">A spring graph (left), a rounded rectangle translating along the Y axis with a spring timing curve (right)</figcaption></figure></div></div></div></div><div class="ab ca"><div class="ch bg fy fz ga gb"><p id="5ea2" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">However, unlike linear and cubic functions, the end time of a spring animation is not always clear-cut. Spring timings have a long tail in which the oscillations gradually diminish in size, while theoretically never reaching rest. 
Practically speaking, this means that animations using spring timings can have extended durations, which can make them feel less responsive.</p><p id="7b09" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Additionally, our spring timing had to be tuned to work with the overall transition animation timing. As mentioned above, our animation framework uses a snapshot during the animated transition before switching the snapshot for the destination view at the conclusion of the animation. This means that the snapshot and the destination view must be in the same position at the time of the switch, to ensure a seamless changeover.</p><p id="1b68" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Below is an example of a mismatch in sync: on the left, the passport snaps into its final position at the last moment, whereas on the right it moves smoothly to its final destination.</p></div></div><div class="oh"><div class="ab ca"><div class="mc qg md qh me qi ce qj cf qk ch bg"><div class="oc od oe of og ab kz"><figure class="ly oh rc qm qn qo qp paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*MMEaB0ikeTlN_gXH 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*MMEaB0ikeTlN_gXH 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*MMEaB0ikeTlN_gXH 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*MMEaB0ikeTlN_gXH 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*MMEaB0ikeTlN_gXH 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*MMEaB0ikeTlN_gXH 1100w, https://miro.medium.com/v2/resize:fit:640/format:webp/0*MMEaB0ikeTlN_gXH 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and 
(max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*MMEaB0ikeTlN_gXH 640w, https://miro.medium.com/v2/resize:fit:720/0*MMEaB0ikeTlN_gXH 720w, https://miro.medium.com/v2/resize:fit:750/0*MMEaB0ikeTlN_gXH 750w, https://miro.medium.com/v2/resize:fit:786/0*MMEaB0ikeTlN_gXH 786w, https://miro.medium.com/v2/resize:fit:828/0*MMEaB0ikeTlN_gXH 828w, https://miro.medium.com/v2/resize:fit:1100/0*MMEaB0ikeTlN_gXH 1100w, https://miro.medium.com/v2/resize:fit:640/0*MMEaB0ikeTlN_gXH 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" /></picture></figure><figure class="ly oh rc qm qn qo qp paragraph-image"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*6zf4AHmHqrCaX74j 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*6zf4AHmHqrCaX74j 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*6zf4AHmHqrCaX74j 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*6zf4AHmHqrCaX74j 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*6zf4AHmHqrCaX74j 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*6zf4AHmHqrCaX74j 1100w, https://miro.medium.com/v2/resize:fit:640/format:webp/0*6zf4AHmHqrCaX74j 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 
50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*6zf4AHmHqrCaX74j 640w, https://miro.medium.com/v2/resize:fit:720/0*6zf4AHmHqrCaX74j 720w, https://miro.medium.com/v2/resize:fit:750/0*6zf4AHmHqrCaX74j 750w, https://miro.medium.com/v2/resize:fit:786/0*6zf4AHmHqrCaX74j 786w, https://miro.medium.com/v2/resize:fit:828/0*6zf4AHmHqrCaX74j 828w, https://miro.medium.com/v2/resize:fit:1100/0*6zf4AHmHqrCaX74j 1100w, https://miro.medium.com/v2/resize:fit:640/0*6zf4AHmHqrCaX74j 640w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 320px" /></picture><figcaption class="pr fe ps nz oa pt pu be b bf z dt rd fi re qt">The Passport animating onto the modal with a snap on the end (left), the Passport animating smoothly onto the modal (right)</figcaption></figure></div></div></div></div><div class="ab ca"><div class="ch bg fy fz ga gb"><p id="47e2" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Switching over too early resulted in the jump; switching over too late made the animation feel sluggish. 
The timing had to balance these considerations: the timing had to achieve the right feel, while landing in the right place at the end of the modal transition.</p><p id="367c" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">To this end, we tuned the springs extensively so that at the conclusion of the modal opening animation, the X and Y coordinates of the Passport animation aligned perfectly at the cutoff, despite the long tail of the spring curve technically extending beyond the transition duration.</p><h1 id="fdaa" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Conclusion</h1><p id="6b19" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">And there we have it! Like any magic trick, a lot of behind-the-scenes effort goes into making things look and feel effortless.</p><p id="cca0" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">Now that we have these new insights, we’re looking forward to bringing even more delightful animations to our applications in the future. If you share our excitement and are interested in contributing to this or other projects, we invite you to explore the career opportunities available at <a class="af ny" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">our careers page</a>.</p><p id="e826" class="pw-post-body-paragraph na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gm bj">We hope you found this exploration of the iOS implementation of the Host Passport insightful. 
To learn more about the declarative transition framework that powers advanced transitions like this throughout Airbnb’s iOS app, you can read our previous post: “<a class="af ny" rel="noopener" href="https://medium.com/airbnb-engineering/motion-engineering-at-scale-5ffabfc878">Motion Engineering at Scale</a>”.</p><h1 id="c821" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Acknowledgments</h1><p id="4af9" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj">Thanks to:</p><ul class=""><li id="45a0" class="na nb gt nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx rf rg rh bj">Cal Stephens</li><li id="40ba" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Matthew Cheok</li><li id="f10f" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Alejandro Erviti</li><li id="ecbc" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Julian Adams</li><li id="88ff" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Sergii Rudenko</li><li id="fdf4" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Carol Leung</li><li id="acc9" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Jeduan Cornejo</li><li id="4f51" class="na nb gt nc b nd ri nf ng nh rj nj nk nl rk nn no np rl nr ns nt rm nv nw nx rf rg rh bj">Marjorie Kasten</li></ul><h1 id="c847" class="on oo gt be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">****************</h1><p id="6819" class="pw-post-body-paragraph na nb gt nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gm bj"><em class="pw">All product names, logos, and brands are property of their respective owners. 
All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/animations-bringing-the-host-passport-to-life-on-ios-72856aea68a7</link>
      <guid>https://medium.com/airbnb-engineering/animations-bringing-the-host-passport-to-life-on-ios-72856aea68a7</guid>
      <pubDate>Tue, 07 May 2024 19:15:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb Brandometer: Powering Brand Perception Measurement on Social Media Data with AI]]></title>
      <description><![CDATA[<p><strong>How we quantify brand perceptions from social media platforms through deep learning</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*VWesH2nPt7K2bRIu"></figure><p>By <a href="https://medium.com/u/09e8a5894d7f">Tiantian Zhang</a>, <a href="https://medium.com/u/865e093fb5f3">Shuai Shao (Shawn)</a></p><h3>Introduction</h3><p>At Airbnb, we have developed Brandometer, a state-of-the-art natural language understanding (NLU) technique for understanding brand perception based on social media data.</p><p>Brand perception refers to the general feelings and experiences of customers with a company. Measuring brand perception quantitatively is an extremely challenging task. Traditionally, we rely on customer surveys to find out what customers think about a company. The downsides of such a qualitative study are sampling bias and limited data scale. Social media data, on the other hand, is the largest consumer database where users share their experiences and is the ideal complementary consumer data to capture brand perceptions.</p><p>Compared to traditional approaches that extract top relevant topics based on co-occurrence and counts, Brandometer learns <a href="https://en.wikipedia.org/wiki/Word_embedding">word embeddings</a> and uses embedding distances to measure the relatedness of brand perceptions (e.g., ‘belonging’, ‘connected’, ‘reliable’). A word embedding represents words as real-valued vectors, and it performs well at preserving the semantic meaning and relatedness of words. Word embeddings obtained from deep neural networks are arguably the most popular and evolutionary approaches in NLU. 
We explored a variety of word embedding models, from quintessential algorithms <a href="https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf">Word2Vec</a> and <a href="https://arxiv.org/abs/1607.04606">FastText</a>, to the latest language model <a href="https://arxiv.org/abs/2006.03654">DeBERTa</a>, and compared them in terms of generating reliable brand perception scores.</p><p>For concepts represented as words, we use similarity between its embedding and that of “Airbnb” to measure how important the concept is with respect to the Airbnb brand, which is named as Perception Score. Brand Perception is defined as <a href="https://en.wikipedia.org/wiki/Cosine_similarity">Cosine Similarity</a> between Airbnb and the specific keyword:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/301/1*fGrdGlIidRgdt6jT0XavYg.gif"></figure><p>where</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/331/1*cB95joQHnMYOsbcrrIWLMQ.gif"><figcaption>Eq. 1</figcaption></figure><p>In this blog post, we will introduce how we process and understand social media data, capture brand perceptions via deep learning and how to ‘convert’ the cosine similarities to calibrated Brandometer metrics. 
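</p><p>As a rough illustration of the Perception Score in Eq. 1, the sketch below computes the cosine similarity between the “airbnb” vector and every other word vector. The function names and the three-dimensional toy embeddings are assumptions made up for this example; real embeddings would come from one of the trained models.</p><pre>
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def perception_scores(embeddings, brand="airbnb"):
    """Score every non-brand word by its cosine similarity to the brand vector."""
    brand_vec = embeddings[brand]
    return {
        word: cosine_similarity(vec, brand_vec)
        for word, vec in embeddings.items()
        if word != brand
    }


# Toy 3-dimensional embeddings (made-up numbers, for illustration only).
toy_embeddings = {
    "airbnb": [1.0, 0.0, 0.0],
    "home": [2.0, 0.0, 0.0],
    "rental": [1.0, 1.0, 0.0],
    "invoice": [0.0, 0.0, 3.0],
}
scores = perception_scores(toy_embeddings)
```
</pre><p>Because cosine similarity ignores vector length, “home” scores as high as possible here even though its vector is twice as long as the brand vector, while the orthogonal “invoice” vector scores zero.</p><p>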
We will also share the insights derived from Brandometer metrics.</p><h3>Brandometer Methodology</h3><h4>Problem Setup and Data</h4><p>In order to measure brand perception on social media, we assessed all Airbnb-related mentions from 19 platforms (e.g., X (formerly known as Twitter), Facebook, Reddit, etc.) and generated word embeddings with state-of-the-art models.</p><p>In order to use social media data to generate meaningful word embeddings for the purpose of measuring brand perception, we had to overcome two challenges:</p><ul><li><strong>Quality</strong>: Social media posts are mostly user-generated with varying content such as status sharing and reviews, and can be very noisy.</li><li><strong>Quantity</strong>: Social media post sparsity is another challenge. Considering that it typically requires some time for social media users to generate data in response to certain activities and events, a monthly rolling window maintains a good balance of promptness and detectability. Our monthly dataset is relatively small (around 20 million words) as compared to a typical dataset used to train good quality word embeddings (e.g., about 100 billion words for the Google News Word2Vec model). Warm-starting from pre-trained models didn’t help since the in-domain data barely moved the learned embeddings.</li></ul><p>We developed multiple data cleaning processes to improve data quality. At the same time, we adapted the modeling techniques to mitigate the impact of data quantity and quality on word embedding quality.</p><p>In addition to data, we explored and compared multiple word embedding training techniques with the goal of generating reliable brand perception scores.</p><h4>Word2Vec</h4><p><a href="https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf">Word2Vec</a> has been among the simplest and most widely used word embedding models since 2013. 
We started by building CBOW-based Word2Vec models using <a href="https://radimrehurek.com/gensim/">Gensim</a>. Word2Vec produced decent in-domain word embeddings and, more importantly, captured analogies. In our domain-specific word embeddings, we are able to capture analogies in the Airbnb domain, such as <em>“host” - “provide” + “guest” ~= “need”</em>, <em>“city” - “mall” + “nature” ~= “park”</em>.</p><h4>FastText</h4><p>FastText takes into account the internal structure of words, and is more robust to out-of-vocabulary words and smaller datasets. Moreover, inspired by <a href="https://analyticsindiamag.com/guide-to-sense2vec-contextually-keyed-word-vectors-for-nlp/">Sense2Vec</a>, we associate words with sentiments (i.e., POSITIVE, NEGATIVE, NEUTRAL), which forms brand perception concepts at the sentiment level.</p><h4>DeBERTa</h4><p>Recent progress in transformer-based language models (e.g., <a href="https://arxiv.org/abs/1810.04805">BERT</a>) has significantly improved the performance of NLU tasks with the advantage of generating contextualized word embeddings. We developed <a href="https://arxiv.org/abs/2006.03654">DeBERTa</a>-based word embeddings; DeBERTa works better with smaller datasets and pays more attention to surrounding context via disentangled attention mechanisms. We trained everything from scratch (including the tokenizer) using <a href="https://huggingface.co/docs/transformers/index">Transformers</a>, and the concatenated last attention layer embeddings resulted in the best word embeddings for our case.</p><h4>Brand Perception Score Stabilization and Calibration</h4><p>The variability of word embeddings has been widely studied (<a href="https://arxiv.org/pdf/2104.08433.pdf">Borah, 2021</a>). 
The causes range from the underlying stochastic nature of deep learning models (e.g., random initialization of word embeddings, embedding training that converges to a local optimum of a global optimization criterion) to changes in the quantity and quality of the data corpus over time.</p><p>With Brandometer, we need to reduce the variability in embedding distances to generate stable time series tracking. Stable embedding distances help preserve the inherent patterns and structures present in the time series data, and hence contribute to better predictability of the tracking process. They also make the tracking process more robust to noisy fluctuations. We studied the influential factors and took the following steps to reduce this variability:</p><ol><li>Score averaging over repeated training with bootstrap sampling</li><li>Rank-based perception score</li></ol><blockquote><strong>Score averaging over repeated training with upsampling</strong></blockquote><p>For each month’s data, we trained <em>N</em> models with the same hyper-parameters, and took the average of the <em>N</em> perception scores as the final score for each concept. Meanwhile, we did upsampling to make sure that each model iterated on an equal number of data points across months.</p><p>We defined variability as:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/461/1*pJeFFkML9OLgYyYVAChJIA.gif"><figcaption>Eq. 2</figcaption></figure><p>where</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/334/1*O9bfbXD2k2yCGHAl0LmaXg.gif"></figure><p><em>CosSim(w)</em> refers to the cosine-similarity-based perception score defined in Eq. 1, <em>A</em> refers to the algorithm, <em>M</em> refers to the time window (i.e., month), <em>V</em> refers to the vocabulary and <em>|V|</em> is the vocabulary size, and <em>n</em> refers to the number of repeatedly trained models.</p><p>As <em>N</em> approaches 30, the score variability values converge and settle within a narrow interval. 
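</p><p>The averaging step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production pipeline: <code>train_and_score</code> is a hypothetical callable that trains one model on a sample and returns a word-to-score mapping, <code>toy_scorer</code> is a made-up stand-in for it, and drawing a fresh sample with replacement for every run is our assumption about how the resampling is applied.</p><pre>
```python
import random
from statistics import mean


def upsample(posts, target_size, rng):
    """Sample posts with replacement so every month yields target_size posts."""
    return rng.choices(posts, k=target_size)


def averaged_scores(posts, train_and_score, n_models=30, target_size=1000, seed=0):
    """Train n_models times on resampled data and average each concept's score."""
    rng = random.Random(seed)
    runs = [train_and_score(upsample(posts, target_size, rng)) for _ in range(n_models)]
    # Only keep concepts that received a score in every run.
    shared = set(runs[0]).intersection(*runs[1:])
    return {word: mean(run[word] for run in runs) for word in shared}


def toy_scorer(sample):
    """Hypothetical stand-in for model training: share of posts mentioning a word."""
    return {w: sum(w in post for post in sample) / len(sample) for w in ("home", "host")}


posts = ["lovely home", "great host", "cozy home and kind host", "long flight"]
final = averaged_scores(posts, toy_scorer, n_models=30, target_size=200)
```
</pre><p>Fixing <code>target_size</code> across months mirrors the goal of having each model iterate over the same number of data points, and the seeded random generator keeps the averaged scores reproducible.</p><p>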
Hence, we picked <em>N = 30</em> for all.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/657/1*2GJvc2uZGLMwu4XQs01FBQ.png"></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*lNjBT74QexfOha39"><figcaption>Figure 1. Score variability changes of the FastText models across months in a year with increasing <em>N</em>.</figcaption></figure><blockquote><strong>Rank-based perception score</strong></blockquote><p>Based on <a href="https://aclanthology.org/Q18-1008/">Maria Antoniak’s</a> work, we used the overlap between nearest neighbors to measure the stability of word embeddings, since the relative distances matter more than the absolute distance values in downstream tasks. Therefore, we also developed rank-based scores, which show greater stability than similarity-based scores.</p><p>We first ranked all words in descending order of cosine similarity (Eq. 1). The rank-based similarity score is then computed as <em>1/rank(w)</em> where <em>w∈V</em>. More relevant concepts will have higher rank-based perception scores.</p><p>The score variability is defined the same as <em>Variability(A, M, V)</em> in Eq. 2 except that</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/347/1*WTrXi2M45e5zIDJg8gl0GA.gif"></figure><p>where <em>RankSim(w)</em> refers to the rank-based perception score. With rank-based scores, as <em>N</em> approaches <em>30</em>, the score variability values converge to a much narrower interval, especially for DeBERTa.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hVNXGg1bCkaKbfBxZx3qyw.png"></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*TOGMO-HG93AXV2UD"><figcaption>Figure 2. 
Rank-based score variability changes of the FastText models across months in 2020 with increasing <em>N</em>.</figcaption></figure><h4>Selection of Score Output by Designed Metrics</h4><p>One challenge of this project was that we didn’t have a simple and ultimate way to conclude which score output was better since there is no objective ‘truth’ of brand perception. Instead, we defined a new metric to learn some characteristics of the score.</p><p><strong>Average Variance Across Different Period (AVADP)</strong></p><ul><li>We first picked the group of top relevant brand perceptions for Airbnb: ‘host,’ ‘vacation,’ ‘rental,’ ‘love,’ ‘stay,’ ‘home,’ ‘booking,’ ‘travel,’ ‘guest’.</li><li>Higher value indicates more fluctuations across different periods — likely a bad thing, because the selected brand perception is assumed to be relatively stable and hence should not vary too much month by month.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/854/1*uZSnN3IM0N3dY13P9-jqrA.png"></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/296/1*JXBekuSCh3AvKzgLaFzZ6w.gif"><figcaption>i<em> ∈ (1,n)</em>, <em>n</em> as number of selected top perceptions, T as number of periods</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FiVCJ9lR5mc-KRy0rwfaHQ.png"></figure><p>We checked these statistics on the calibrated results as shown above. 
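</p><p>To make the two ideas concrete, here is a small sketch of the <em>1/rank(w)</em> score and of an AVADP-style statistic. The exact AVADP formula appears only as an image in this post, so the version below (the mean, over the selected concepts, of each concept’s score variance across periods) is our reading of the metric’s name and caption and should be treated as an assumption. The function names and monthly numbers are made up, and no calibration is applied.</p><pre>
```python
from statistics import mean, pvariance


def rank_scores(cosine_scores):
    """Map each word to 1/rank, where rank 1 is the highest cosine similarity."""
    ordered = sorted(cosine_scores, key=cosine_scores.get, reverse=True)
    return {word: 1.0 / rank for rank, word in enumerate(ordered, start=1)}


def avadp(monthly_scores, concepts):
    """Mean over concepts of each concept's score variance across periods (assumed form)."""
    return mean(pvariance([month[c] for month in monthly_scores]) for c in concepts)


# Made-up monthly cosine scores for three concepts.
months = [
    {"home": 0.82, "host": 0.80, "stay": 0.41},
    {"home": 0.74, "host": 0.71, "stay": 0.39},
    {"home": 0.90, "host": 0.88, "stay": 0.45},
]
concepts = ["home", "host", "stay"]
cosine_avadp = avadp(months, concepts)
rank_avadp = avadp([rank_scores(m) for m in months], concepts)
```
</pre><p>In this toy data the ordering of the three concepts never changes, so their rank-based scores stay constant from month to month even though the raw cosine values drift. These numbers only illustrate the mechanics and are not the results reported in this post.</p><p>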
We can see that the rank-based score outperforms the similarity-based scores:</p><ul><li><strong>Lower AVADP</strong>: Fewer fluctuations than the non-ranked scores across different periods, which is likely a good thing, because the selected brand perceptions are assumed to be relatively stable and hence should not vary too much month by month.</li></ul><h3>Use Cases of Brandometer</h3><p>Though we set out to solve the problem of brand measurement, we believe use cases can go above and beyond:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Uw6wXQ3ph8IgLVk9"></figure><h4>Use Cases Deep Dive</h4><p><strong>Industry Analysis: Top Brand Perception among Key Players [Monthly Top Perception]</strong></p><p>With top perceptions such as “Stay” and “Home,” Airbnb provides a brand image of “<strong>belonging</strong>”, echoing our mission statement and unique supply inventory, while other companies have “Rental,” “Room,” “Booking,” a description of functionality, not human sensation.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*y6uLOogOuL9xzr7Zo5V7iw.png"></figure><p><strong>Top Emerging Perception reveals major events discussed online [Monthly Top Perception]</strong></p><p>The <strong>Top 10</strong> Perceptions are generally stable month to month. The top standing perceptions include</p><ul><li>Home, Host, Stay, Travel, Guest, Rental, etc.</li></ul><p>Meanwhile, we use Brandometer to monitor emerging perceptions that jump to the top list, which may reflect major events associated with the brand or user preference changes.</p><p><strong>Major Campaign Monitor (Time Series Tracking)</strong></p><p>Businesses create campaigns to promote products and expand the brand image. We were able to capture a perception change on one specific Brand Theme after a related campaign.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/729/0*iax9id0qhOGbp1vT"><figcaption>Figure 3. 
Example of brand perception change due to a campaign. Scores shown here are based on the Calibrated Rank-based Score.</figcaption></figure><p>These use cases are just the beginning. Essentially, this is an innovative way of gathering massive online input as we learn the needs and perceptions of the community. We will constantly reflect on how we leverage these insights to continually improve the Airbnb experience for our community.</p><h3>Next Steps</h3><p>Airbnb’s innovative Brandometer has already demonstrated success in capturing brand perception from social media data. There are several directions for future improvement:</p><ul><li>Improve content segmentation for clearer and more concise insights.</li><li>Develop more metrics reflecting social media brand perception.</li><li>Enhance the data foundation to cover not just Airbnb but also other companies in the same market segment, for more comprehensive insights.</li></ul><p>If this kind of work sounds appealing to you, check out our <a href="https://careers.airbnb.com/">open roles</a> — we’re hiring!</p><h3>Acknowledgments</h3><p>Thanks to Mia Zhao, Bo Zeng, and Cassie Cao for contributing the best ideas on improving and landing Airbnb Brandometer. Thanks to Jon Young, Narin Leininger, and Allison Frelinger for the support of social media data consolidation. Thanks to Linsha Chen, Sam Barrows, Hannah Jeton, and Irina Azu, who provided feedback and suggestions. Thanks to Lianghao Li, Kelvin Xiong, Nathan Triplett, Joy Zhang, and Andy Yasutake for reviewing and polishing the blog post content and all the great suggestions. 
Thanks to Joy Zhang, Tina Su, and Andy Yasutake for leadership support!</p><p>Special thanks to Joy Zhang, who initiated the idea, for all the inspiring conversations, continuous guidance and support!</p><hr><p><a href="https://medium.com/airbnb-engineering/airbnb-brandometer-powering-brand-perception-measurement-on-social-media-data-with-ai-c83019408051">Airbnb Brandometer: Powering Brand Perception Measurement on Social Media Data with AI</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnb-brandometer-powering-brand-perception-measurement-on-social-media-data-with-ai-c83019408051</link>
      <guid>https://medium.com/airbnb-engineering/airbnb-brandometer-powering-brand-perception-measurement-on-social-media-data-with-ai-c83019408051</guid>
      <pubDate>Fri, 26 Apr 2024 18:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Introducing Trio | Part III]]></title>
      <description><![CDATA[<h4>Part three on how we built a Compose based architecture with Mavericks in the Airbnb Android app</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*f5gDJHZdQ6lSewSKhZya-A.jpeg"></figure><p>By: <a href="https://www.linkedin.com/in/eli-hart-54a4b975/">Eli Hart</a>, <a href="https://www.linkedin.com/in/schwabben/">Ben Schwab</a>, and <a href="https://www.linkedin.com/in/yvonnejwong">Yvonne Wong</a></p><p>Trio is Airbnb’s framework for Jetpack Compose screen architecture in Android. It’s built on top of <a href="https://github.com/airbnb/mavericks">Mavericks</a>, Airbnb’s open source state management library for Jetpack. In this blog post series, we’ve been breaking down how Trio works to help explain our design decisions, in the hopes that other teams might benefit from aspects of our approach.</p><p>We recommend starting with <a href="https://medium.com/p/7f5017a1a903">Part 1</a>, about Trio’s architecture, and then reading <a href="https://medium.com/p/fe836013a798">Part 2</a>, about how navigation works in Trio, before you dive into this article. In this third and final part of our series, we’ll discuss how Props in Trio allow for simplified, type-safe communication between ViewModels. We’ll also share an update on the current adoption of Trio at Airbnb and what’s next.</p><h4>Trio Props</h4><p>To better understand Props, let’s look at an example of a simple Message Inbox screen, composed of two Trios side by side. There is a List Trio on the left, showing inbox messages, and a Details Trio on the right, showing the full text of a selected message.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NNhLx3vCzmjE2wOSjClQrw.png"></figure><p>The two Trios are wrapped by a parent screen, which is responsible for instantiating the two children, passing along data to them, and positioning them in the UI. 
As you might recall from Part 2, Trios can be stored in State; the parent’s State includes both the message data and the child Trios.</p><pre>data class ParentState(<br>  val inboxMessages: List&lt;Message&gt;,<br>  val selectedMessage: Message?,<br>  // Child Trios are held in State, typed by the Props they require<br>  val messageListScreen: Trio&lt;ListProps&gt;,<br>  val messageDetailScreen: Trio&lt;DetailProps&gt;,<br>) : MavericksState</pre><p>The parent’s UI decides how to display the children, which it accesses from the State. With Compose UI, it’s easy to apply custom layout logic: we show the screens side by side when the device is in landscape mode, and in portrait we show only a single screen, depending on whether a message has been selected.</p><pre>@Composable<br>override fun TrioRenderScope.Content(state: ParentState) {<br>  if (LocalConfiguration.current.orientation == ORIENTATION_LANDSCAPE) {<br>    // Landscape: show the list and details side by side<br>    Row(Modifier.fillMaxSize()) {<br>      ShowTrio(state.messageListScreen, modifier = Modifier.weight(1f))<br>      ShowTrio(state.messageDetailScreen)<br>    }<br>  } else {<br>    // Portrait: show a single screen, depending on the selection<br>    if (state.selectedMessage == null) {<br>      ShowTrio(state.messageListScreen)<br>    } else {<br>      BackHandler { viewModel.clearMessageSelection() }<br>      ShowTrio(state.messageDetailScreen)<br>    }<br>  }<br>}</pre><p>Both child screens need access to the latest message state so they know which content to show. We can provide this with Props!</p><p>Props are a collection of Kotlin properties, held in a data class and passed to a Trio by its parent.</p><p>Unlike Arguments, Props can change over time, allowing a parent to provide updated data as needed throughout the lifetime of the Trio. Props can include lambda expressions, allowing a screen to communicate back to its parent.</p><p>A child Trio can only be shown in a parent that supports its Props type. 
This ensures compile-time correctness for navigation and communication between Trios.</p><h4>Defining Props</h4><p>Let’s see how Props are used to pass message data from the parent Trio to the List and Details Trios. When a parent defines child Trios in its State, it must include the type of Props that those children require. For our example, the List and Details screens each have their own unique Props.</p><p>The List screen needs to know the list of all Messages and whether one is selected. It also needs to be able to call back to the parent to tell it when a new message has been selected.</p><pre>data class ListProps(<br>  val selectedMessage: Message?,<br>  val inboxMessages: List&lt;Message&gt;,<br>  val onMessageSelected: (Message) -&gt; Unit,<br>)</pre><p>The Details screen just needs to know which message to display.</p><pre>data class DetailProps(<br>  val selectedMessage: Message?<br>)</pre><p>The parent ViewModel holds the child instances in its State, and is responsible for passing the latest Props value to the children.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*aZuD30OJ_eCdVrQy"></figure><h4>Passing Props</h4><p>So, how does a parent Trio pass Props to its child? In its init block, it must call the launchChildInitializer function, whose first lambda selects a Trio instance from the State, specifying which Trio is being targeted.</p><pre>class ParentViewModel: TrioViewModel {<br><br>  init {<br>    launchChildInitializer({ messageListScreen }) { state -&gt;<br>      ListProps(<br>        state.selectedMessage,<br>        state.inboxMessages,<br>        ::showMessageDetails<br>      )<br>    }<br><br>    launchChildInitializer({ messageDetailScreen }) { state -&gt;<br>      DetailProps(state.selectedMessage)<br>    }<br>  }<br><br>  fun showMessageDetails(message: Message?) ...<br>}</pre><p>The second lambda argument receives a State value and returns a new Props instance to pass to the child. 
This function manages the lifecycle of the child, initializing it with a flow of Props when it is first created, and destroying it if it is ever removed from the parent’s state.</p><p>The lambda to rebuild Props is re-invoked every time the Parent’s state changes, and any new value of Props is passed along to the child through its flow.</p><p>A common pattern we use is to include function references in the Props, which point to functions on the parent ViewModel. This allows the child to call back to the parent for event handling. In the example above we do this with the showMessageDetails function. Props can also be used to pass along complex dependencies, which forms a dependency graph scoped to the parent.</p><p>Note that we cannot pass Props to a Trio when it is created, like we do with Args. This is because Trios must be able to be restored after process death, and so the Trio class, as well as the Args used to create it, are Parcelable. Since Props can contain lambdas and other arbitrary objects that cannot be safely serialized, we must use the above pattern to establish a flow of Props from parent to child that can be reestablished even after process recreation. Navigation and inter-screen communication would be a lot simpler if we didn’t have to handle process recreation!</p><h4>Using Props</h4><p>In order for a child Trio to use Props data in its UI, it first needs to be copied to State.</p><p>Child ViewModels override the function updateStateFromPropsChange to specify how to incorporate Prop values into State. The function is invoked every time the value of Props changes, and the new State value is updated on the ViewModel. 
This is how children stay up-to-date with the latest data from their parent.</p><pre>class ListViewModel : TrioViewModel&lt;ListProps, ListState&gt; {<br><br>  override fun updateStateFromPropsChange(<br>    newProps: ListProps,<br>    thisState: ListState<br>  ): ListState {<br>    return thisState.copy(<br>      inboxMessages = newProps.inboxMessages,<br>      selectedMessage = newProps.selectedMessage<br>    )<br>  }<br><br>  fun onMessageSelected(message: Message) {<br>    props.onMessageSelected(message)<br>  }<br>}</pre><p>For non-state values in Props, such as dependencies or callbacks, the ViewModel can access the latest Props value at any time via the props property. For example, we do this in the onMessageSelected function in the sample code above. The List UI will invoke this function when a message is selected, and the event will be propagated to the parent through Props.</p><p>There were a lot of complexities when implementing Props — for example, when handling edge cases around the Trio lifecycle and restoring state after process death. However, the internals of Trio hide most of the complexity from the end user. Overall, having an opinionated, codified system with type safety for how Compose screens communicate has helped improve standardization and productivity across our Android engineering team.</p><h3>Standardizing Screen Flow Props</h3><p>One of the most common UI patterns at Airbnb is to coordinate a stack of screens. These screens may share some common data, and follow similar navigation patterns such as pushing, popping, and removing all the screens of the backstack in tandem.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_BV0EOOASuaeXtECVQJ3sA.gif"></figure><p>Earlier, we showed how a Trio can manage a list of children in its State to accomplish this, but it’s tedious to do that manually. 
To help, Trio provides a standard “screen flow” implementation, which consists of a parent ScreenFlow Trio and related child Trio screens. The parent ScreenFlow automatically manages child transactions, and renders the top child in its UI. It also broadcasts a custom Props class to its children, giving access to shared state and navigation functions.</p><p>Consider building a Todo app that has a TodoListScreen, a TaskScreen, and an EditTaskScreen. These screens can all share a single network request that returns a TodoList model. In Trio terms, the TodoList data model could be the Props for these three screens.</p><p>To manage these screens we use ScreenFlow infrastructure to create a TodoScreenFlow Trio. Its state extends ScreenFlowState and overrides a childScreenTransactions property to hold the transactions. In this example, the flow’s State is initialized to start with the TodoListScreen, so it will be rendered first. The flow’s State object also acts as the source of truth for other shared state, such as the TodoList data model.</p><pre>data class TodoFlowState(<br>  @PersistState<br>  override val childScreenTransactions: List&lt;ScreenTransaction&lt;TodoFlowProps&gt;&gt; = listOf(<br>    ScreenTransaction(Router.TodoListScreen.createFullPaneTrio(NoArgs))<br>  ),<br>  // shared state<br>  val todoListQuery: TodoList?,<br>) : ScreenFlowState&lt;TodoFlowState, TodoFlowProps&gt;</pre><p>This state is private to the TodoScreenFlow. However, the flow defines Props to share the TodoList data model, callbacks like a reloadList lambda, and a NavController with its children.</p><pre>data class TodoFlowProps(<br>  val navController: NavController&lt;TodoFlowProps&gt;,<br>  val todoListQuery: TodoList?,<br>  val reloadList: () -&gt; Unit,<br>)</pre><p>The NavController prop can be used by the child screens to push another sibling screen in the flow. 
The ScreenFlowViewModel base class implements this NavController interface, managing the complexity of integrating the navigation actions into the screen flow’s state.</p><pre>interface NavController&lt;PropsT&gt; {<br>   fun push(router: TrioRouter&lt;*, in PropsT&gt;)<br>   fun pop()<br>}</pre><p>Lastly, the navigation and shared state are wired into a flow of Props when the TodoScreenFlowViewModel overrides createFlowProps. This function will be invoked anytime the internal state of TodoScreenFlowViewModel changes, meaning any update to the TodoList model will be propagated to the child screens.</p><pre>class TodoScreenFlowViewModel(<br>  initializer: Initializer&lt;NavPopProps, TodoFlowState&gt;<br>) : ScreenFlowViewModel&lt;NavPopProps, TodoFlowProps, TodoFlowState&gt;(initializer) {<br><br>  override fun createFlowProps(<br>    state: TodoFlowState,<br>    props: NavPopProps<br>  ): TodoFlowProps {<br>    return TodoFlowProps(<br>      navController = this,<br>      todoListQuery = state.todoListQuery,<br>      reloadList = ::reloadList,<br>    )<br>  }<br>}</pre><p>Inside one of the child screens’ ViewModels, we can see that it will receive the shared Props:</p><pre>class TodoListViewModel(<br>  initializer: Initializer&lt;TodoFlowProps, TodoListState&gt;<br>) : TrioViewModel&lt;TodoFlowProps, TodoListState&gt;(initializer) {<br><br>  override fun updateStateFromPropsChange(<br>    newProps: TodoFlowProps,<br>    thisState: TodoListState<br>  ): TodoListState {<br>    // Incorporate the shared data model into this Trio’s private state passed to its UI:<br>    return thisState.copy(todoListQuery = newProps.todoListQuery)<br>  }<br><br>  fun navigateToTodoTask(task: TodoTask) {<br>    this.props.navController.push(Router.TodoTaskScreen, TodoTaskArgs(task.id))<br>  }<br>}</pre><p>In navigateToTodoTask, the NavController prepared by the flow parent is used to safely navigate to the next screen in the flow (guaranteeing it will receive the shared TodoFlowProps). 
Internally, the NavController updates the ScreenFlow’s childScreenTransactions, triggering the ScreenFlow infra to provide the shared TodoFlowProps to the new screen and render it.</p><h3>Trio’s Success at Airbnb</h3><h4>Development history and launch</h4><p>We started designing Trio in late 2021, with the first Trio screens seeing production traffic in mid-2022.</p><p>As of March 2024, we have over 230 Trio screens with significant production traffic at Airbnb.</p><p>From surveying our developers, we’ve heard that many of them enjoy the overall Trio experience; they like having clear and opinionated patterns and are happy to be in a pure Compose environment. As one developer put it, “Props was a huge plus by allowing multiple screens to share callbacks, which simplified some of my code logic a lot.” Another said, “Trio makes you unlearn bad habits and adopt best practices that work for Airbnb based on our past learnings.” Overall, our team reports faster development cycles and cleaner code. “It makes Android development faster and more enjoyable,” is how one engineer summed it up.</p><h4>Dev Tooling</h4><p>To support our engineers, we have invested in IDE tooling with an in-house Android Studio Plugin. It includes a Trio Generation tool that creates all of the files and boilerplate for a new Trio, including routing, mocks, and tests.</p><p>The tool helps the user choose which Arguments and Props to use, and helps with other customization such as setting up custom Flows. 
It also allows us to embed educational experiences to help newcomers ramp up with Trio.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*8JM1jvE7OXP0HfgD"></figure><p>One piece of feedback we heard from engineers was that it was tedious to change a Trio’s Args or Props types, since they are used across many different files.</p><p>We leveraged our IDE plugin to provide a tool to automatically change these types, making this workflow much faster.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*DhCoCfO6GE1w_Idv"></figure><p>Our team leans heavily on tooling like this, and we’ve found it to be very effective in improving the experience of engineers at Airbnb. We’ve adopted Compose Multiplatform for our Plugin UI development, which we believe has made building powerful developer tooling more feasible and enjoyable.</p><h3>Conclusion</h3><p>Overall, with more than 230 of our production screens implemented as Trios, Trio’s organic adoption at Airbnb has proven that many of our bets and design choices were worth the tradeoffs.</p><p>One change we are anticipating, though, is to incorporate shared element transitions between screens once the Compose framework provides APIs to support that functionality. When Compose APIs for this are available, we’ll likely have to redesign our navigation APIs accordingly.</p><p>Thanks for following along with the work we’ve been doing at Airbnb. 
Our Android Platform team works on a variety of complex and interesting projects like Trio, and we’re excited to share more in the future.</p><p>If this kind of work sounds appealing to you, check out our <a href="https://careers.airbnb.com/">open roles</a> — we’re hiring!</p><hr><p><a href="https://medium.com/airbnb-engineering/introducing-trio-part-iii-033fbfe2171b">Introducing Trio | Part III</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/introducing-trio-part-iii-033fbfe2171b</link>
      <guid>https://medium.com/airbnb-engineering/introducing-trio-part-iii-033fbfe2171b</guid>
      <pubDate>Thu, 11 Apr 2024 19:33:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Chronon, Airbnb’s ML Feature Platform, Is Now Open Source]]></title>
      <description><![CDATA[<p>A feature platform that offers observability and management tools, allows ML practitioners to use a variety of data sources, while handling the complexity of data engineering, and provides low latency streaming.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*KNHSmM_Zx6RuR8XDNSuEwA.jpeg"></figure><p>By: <a href="https://www.linkedin.com/in/vzanoyan/">Varant Zanoyan</a>, <a href="https://www.linkedin.com/in/nikhilsimha/">Nikhil Simha Raprolu</a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*djCwPhaLqp--OMPw"><figcaption><em>Chronon allows ML practitioners to use a variety of data sources as inputs to feature transformations. It handles the complexity of data plumbing, such as batch and streaming compute, provides low latency serving, and offers a host of observability and management tools.</em></figcaption></figure><p><strong>Airbnb is happy to announce that </strong><a href="https://www.chronon.ai/"><strong>Chronon</strong></a><strong>, our ML Feature Platform, is now open source.</strong></p><p><strong>We’re excited to be making this announcement along with our partners at Stripe, who are early adopters and co-maintainers of the project.</strong></p><p>This blog post covers the main motivation and functionality of Chronon. 
For an overview of core concepts in Chronon, please see <a href="https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04">this previous post</a>.</p><h3>Background</h3><p>We built Chronon to relieve a common pain point for ML practitioners: they were spending the majority of their time managing the data that powers their models rather than on modeling itself.</p><p>Prior to Chronon, practitioners would use one of the following two approaches:</p><ol><li><strong>Replicate offline-online:</strong> ML practitioners train the model with data from the data warehouse, then figure out ways to replicate those features in the online environment. The benefit of this approach is that it allows practitioners to utilize the full data warehouse, including both its data sources and its powerful tools for large-scale data transformation. The downside is that this leaves no clear way to serve model features for online inference, resulting in inconsistencies and label leakage that severely affect model performance.</li><li><strong>Log and wait:</strong> ML practitioners start with the data that is available in the online serving environment from which the model inference will run. They log relevant features to the data warehouse. Once enough data has accumulated, they train the model on the logs, and serve with the same data. The benefit of this approach is that consistency is guaranteed and leakage is unlikely. However, the major drawback is that it can result in long wait times, hindering the ability to respond quickly to changing user behavior.</li></ol><p>The Chronon approach allows for the best of both worlds. Chronon requires ML practitioners to define their features only once, powering both offline flows for model training and online flows for model inference. 
Additionally, Chronon offers powerful tooling for feature chaining, observability and data quality, and feature sharing and management.</p><h3>How It Works</h3><p>Below we explore the main components that power most of Chronon’s functionality using a simple example derived from the <a href="https://chronon.ai/getting_started/Tutorial.html">quickstart guide</a>. You can follow that guide to run this example.</p><p>Let’s assume that we’re a large online retailer, and we’ve detected a fraud vector based on users making purchases and later returning items. We want to train a model to predict whether a given transaction is likely to result in a fraudulent return. We will call this model each time a user starts the checkout flow.</p><h4>Defining Features</h4><p><strong>Purchases Data:</strong> We can aggregate the purchases log data to the user level to give us a view into this user’s previous activity on our platform. Specifically, we can compute SUMs, COUNTs and AVERAGEs of their previous purchase amounts over various time windows.</p><pre>source = Source(<br>    events=EventSource(<br>        table="data.purchases", # This points to the log table in the warehouse with historical purchase events, updated in batch daily<br>        topic="events/purchases", # The streaming source topic<br>        query=Query(<br>            selects=select("user_id","purchase_price"), # Select the fields we care about<br>            time_column="ts") # The event time<br>    ))<br><br>window_sizes = [Window(length=day, timeUnit=TimeUnit.DAYS) for day in [3, 14, 30]] # Define some window sizes to use below<br><br>v1 = GroupBy(<br>    sources=[source],<br>    keys=["user_id"], # We are aggregating by user<br>    online=True,<br>    aggregations=[Aggregation(<br>            input_column="purchase_price",<br>            operation=Operation.SUM,<br>            windows=window_sizes<br>        ), # The sum of purchases prices in various windows<br>        Aggregation(<br>            
input_column="purchase_price",<br>            operation=Operation.COUNT,<br>            windows=window_sizes<br>        ), # The count of purchases in various windows<br>        Aggregation(<br>            input_column="purchase_price",<br>            operation=Operation.AVERAGE,<br>            windows=window_sizes<br>        ), # The average purchase price by user in various windows<br>        Aggregation(<br>            input_column="purchase_price",<br>            operation=Operation.LAST_K(10),<br>        ), # The last 10 purchase prices aggregated as a list<br>    ],<br>)</pre><p><em>This creates a `GroupBy` which transforms the `purchases` event data into useful features by aggregating various fields over various time windows, with `user_id` as a primary key.</em></p><p>This transforms raw purchases log data into useful features at the user level.</p><p><strong>User Data:</strong> Turning User data into features is a little simpler, primarily because we don’t have to worry about performing aggregations. 
In this case, the primary key of the source data is the same as the primary key of the feature, so we can simply extract column values rather than perform aggregations over rows:</p><pre>source = Source(<br>    entities=EntitySource(<br>        snapshotTable="data.users", # This points to a table that contains daily snapshots of all users<br>        query=Query(<br>            selects=select("user_id","account_created_ds","email_verified"), # Select the fields we care about<br>        )<br>    ))<br><br>v1 = GroupBy(<br>    sources=[source],<br>    keys=["user_id"], # Primary key is the same as the primary key for the source table<br>    aggregations=None, # In this case, there are no aggregations or windows to define<br>    online=True,<br>) </pre><p><em>This creates a `GroupBy` which extracts dimensions from the `data.users` table for use as features, with `user_id` as a primary key.</em></p><p><strong>Joining these features together: </strong>Next, we need to combine the previously defined features into a single view that can be both backfilled for model training and served online as a complete vector for model inference. We can achieve this using the Join API.</p><p>For our use case, it’s very important that features are computed as of the correct timestamp. Because our model runs when the checkout flow begins, we want to use the corresponding timestamp in our backfill, such that feature values for model training logically match what the model will see in online inference.</p><p>Here’s what the definition would look like. 
Note that it combines our previously defined features in the right_parts portion of the API (along with another feature set called returns).</p><pre><br>source = Source(<br>    events=EventSource(<br>        table="data.checkouts", <br>        query=Query(<br>            selects=select("user_id"), # The primary key used to join various GroupBys together<br>            time_column="ts",<br>            ) # The event time used to compute feature values as-of<br>    ))<br><br>v1 = Join(  <br>    left=source,<br>    right_parts=[JoinPart(group_by=group_by) for group_by in [purchases_v1, returns_v1, users]] # Include the three GroupBys<br>)</pre><h3>Backfills/Offline Computation</h3><p>The first thing that a user would likely do with the above Join definition is run a backfill with it to produce historical feature values for model training. Chronon performs this backfill with a few key benefits:</p><ol><li><strong>Point-in-time accuracy:</strong> Notice the source that is used as the “left” side of the join above. It is built on top of the “data.checkouts” source, which includes a “ts” timestamp on each row that corresponds to the logical time of that particular checkout. Every feature computation is guaranteed to be window-accurate as of that timestamp. So for the one-month sum of previous user purchases, every row will be computed for the user as of the timestamp provided by the left-hand source.</li><li><strong>Skew handling:</strong> Chronon’s backfill algorithms are optimized for handling highly skewed datasets, avoiding frustrating OOMs and hanging jobs.</li><li><strong>Computational efficiency optimizations:</strong> Chronon is able to bake in a number of optimizations directly into the backend, reducing compute time and cost.</li></ol><h3>Online Computation</h3><p>Chronon abstracts away a lot of complexity for online feature computation. 
In the above examples, how it computes a feature depends on whether the feature is a batch feature or a streaming feature.</p><p><strong>Batch features (for example, the User features above)</strong></p><p>Because the User features are built on top of a batch table, Chronon will simply run a daily batch job to compute the new feature values as new data lands in the batch data store and upload them to the online KV store for serving.</p><p><strong>Streaming features (for example, the Purchases features above)</strong></p><p>The Purchases features are built on a source that includes a streaming component, as indicated by the inclusion of a “topic” in the source. In this case, Chronon will still run a batch upload in addition to a streaming job for real-time updates. The batch job is responsible for:</p><ol><li><strong>Seeding the values:</strong> For long windows, it wouldn’t be practical to rewind the stream and play back all raw events.</li><li><strong>Compressing “the middle of the window” and providing tail accuracy: </strong>For precise window accuracy, we need raw events at both the head and the tail of the window.</li></ol><p>The streaming job then writes updates to the KV store to keep feature values up to date at fetch time.</p><h3>Online Serving / Fetch API</h3><p>Chronon offers an API to fetch features with low latency. We can either fetch values for individual GroupBys (e.g., the Users or Purchases features defined above) or for a Join. Here’s an example of what one such request and response for a Join would look like:</p><pre>// Fetching all features for user=123<br>Map&lt;String, String&gt; keyMap = new HashMap&lt;&gt;();<br>keyMap.put("user_id", "123");<br>Fetcher.fetch_join(new Request("quickstart_training_set_v1", keyMap));<br>// Sample response (map of feature name to value):<br>// {"purchase_price_avg_3d":14.2341, "purchase_price_avg_14d":11.89352, ...}</pre><p><em>Java code that fetches all features for user 123. 
The return type is a map of feature name to feature value.</em></p><p>The above example uses the Java client. There is also a Scala client and a Python CLI tool for easy testing and debugging:</p><pre>run.py --mode=fetch -k '{"user_id":123}' -n quickstart/training_set -t join<br><br>&gt; {"purchase_price_avg_3d":14.2341, "purchase_price_avg_14d":11.89352, ...}</pre><p><em>Utilizes the run.py CLI tool to make the same fetch request as the Java code above. run.py is a convenient way to quickly test Chronon workflows like fetching.</em></p><p>Another option is to wrap these APIs into a service and make requests via a REST endpoint. This approach is used within Airbnb for fetching features in non-Java environments such as Ruby.</p><h3>Online-Offline Consistency</h3><p>Chronon not only helps online-offline accuracy, it also offers a way to measure it. The measurement pipeline starts with the logs of the online fetch requests. These logs include the primary keys and timestamp of the request, along with the fetched feature values. Chronon then passes the keys and timestamps to a Join backfill as the left side, asking the compute engine to backfill the feature values. It then compares the backfilled values to actual fetched values to measure consistency.</p><h3>What’s Next?</h3><p>Open source is just the first step in an exciting journey that we look forward to taking with our partners at Stripe and the broader community.</p><p>Our vision is to create a platform that enables ML practitioners to make the best possible decisions about how to leverage their data and makes enacting those decisions as easy as possible. Here are some questions that we’re currently using to inform our roadmap:</p><p><strong>How much further can we lower the cost of iteration and computation?</strong></p><p>Chronon is already built for the scale of data processed by large companies such as Airbnb and Stripe. 
However, there are always further optimizations that we can make to our compute engine, both to reduce the compute cost and the “time cost” of creating and experimenting with new features.</p><p><strong>How much easier can we make authoring a new feature?</strong></p><p>Feature engineering is the process by which humans express their domain knowledge to create signals that the model can leverage. Chronon could integrate NLP to allow ML practitioners to express these feature ideas in natural language and generate working feature definition code as a starting point for their iteration.</p><p>Lowering the technical bar to feature creation would in turn open the door to new kinds of collaboration between ML practitioners and partners who have valuable domain expertise.</p><p><strong>Can we improve the way models are maintained?</strong></p><p>Changing user behavior can cause shifts in model performance because the data that the model was trained on no longer applies to the current situation. We imagine a platform that can detect these shifts and create a strategy to address them early and proactively, either by retraining, adding new features, modifying existing features, or some combination of the above.</p><p><strong>Can the platform itself become an intelligent agent that helps ML practitioners build and deploy the best possible models?</strong></p><p>The more metadata that we gather into the platform layer, the more powerful it can become as a general ML assistant.</p><p>We mentioned the goal of creating a platform that can automatically run experiments with new data to identify ways to improve models. 
Such a platform might also help with data management by allowing ML practitioners to ask questions such as “What kinds of features tend to be most useful when modeling this use case?” or “What data sources might help me create features that capture signal about this target?” A platform that could answer these types of questions represents the next level of intelligent automation.</p><h3>Getting Started</h3><p>Here are some resources to help you get started or to evaluate if Chronon is a good fit for your team.</p><ul><li>Check out the <a href="https://github.com/airbnb/chronon">project on GitHub</a>, the <a href="https://www.chronon.ai/index.html">Chronon website</a>, and especially the <a href="https://www.chronon.ai/getting_started/Tutorial.html">quickstart guide</a>.</li><li>Drop into our <a href="https://discord.gg/GbmGATNqqP">community Discord channel</a>. The Airbnb and Stripe teams are excited to chat with you about how Chronon might fit into your stack.</li></ul><p>Interested in this type of work? 
Check out our open roles <a href="https://careers.airbnb.com/">here</a> — we’re hiring.</p><h3>Acknowledgements</h3><p><strong>Sponsors:</strong> <a href="mailto:henry.saputra@airbnb.com">Henry Saputra</a> <a href="mailto:yi.li@airbnb.com">Yi Li</a> <a href="mailto:jack.song@airbnb.com">Jack Song</a></p><p><strong>Contributors:</strong> <a href="mailto:pengyu.hou@airbnb.com">Pengyu Hou</a> <a href="mailto:cristian.figueroa@airbnb.com">Cristian Figueroa</a> <a href="mailto:haozhen.ding@airbnb.com">Haozhen Ding</a> <a href="mailto:sophie.wang@airbnb.com">Sophie Wang</a> <a href="mailto:vamsee.y@airbnb.com">Vamsee Yarlagadda</a> <a href="mailto:haichun.chen@airbnb.com">Haichun Chen</a> <a href="mailto:donghan.zhang@airbnb.com">Donghan Zhang</a> <a href="mailto:hao.cen@airbnb.com">Hao Cen</a> <a href="mailto:yuli.han@airbnb.com">Yuli Han</a> <a href="mailto:evgeny.shapiro@airbnb.com">Evgenii Shapiro</a> Patrick Yoon</p><hr><p><a href="https://medium.com/airbnb-engineering/chronon-airbnbs-ml-feature-platform-is-now-open-source-d9c4dba859e8">Chronon, Airbnb’s ML Feature Platform, Is Now Open Source</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/chronon-airbnbs-ml-feature-platform-is-now-open-source-d9c4dba859e8</link>
      <guid>https://medium.com/airbnb-engineering/chronon-airbnbs-ml-feature-platform-is-now-open-source-d9c4dba859e8</guid>
      <pubDate>Mon, 08 Apr 2024 19:18:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Introducing Trio | Part II]]></title>
      <description><![CDATA[<h4>Part two on how we built a Compose based architecture with Mavericks in the Airbnb Android app</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/862/1*Ill4nQ2nbUwqQakROhb-6Q.jpeg"></figure><p>By: <a href="https://www.linkedin.com/in/eli-hart-54a4b975/">Eli Hart</a>, <a href="https://www.linkedin.com/in/schwabben/">Ben Schwab</a>, and <a href="https://www.linkedin.com/in/yvonnejwong">Yvonne Wong</a></p><p>In the previous post in this series, we introduced you to Trio, Airbnb’s framework for Jetpack Compose screen architecture in Android. Some of the advantages of Trio include:</p><ul><li>Guarantees type safety when communicating across module boundaries in complex apps</li><li>Codifies expectations about how ViewModels are used and shared, and what interfaces look like between screens</li><li>Allows for stable screenshot and UI tests and simple navigation testing</li><li>Compatible with <a href="https://github.com/airbnb/mavericks">Mavericks</a>, Airbnb’s open source state management library for Jetpack (Trio is built on top of Mavericks)</li></ul><p>If you need a refresher on Trio or are learning about this framework for the first time, start with <a href="https://medium.com/p/7f5017a1a903">Part 1</a>. It provides an overview of why we built Trio when transitioning to Compose from a Fragments-based architecture. Part 1 also explains the core framework concepts like the Trio class and UI class.</p><p>In this post, we’ll build upon what we’ve shared so far and dive into how navigation works in Trio. As you’ll see, we designed Trio to make navigation simpler and easier to test, especially for large, modularized applications.</p><h4>Navigating with Trio</h4><p>A unique approach in our design is that Trios are stored in the ViewModel’s State, right alongside all other data that a Screen exposes to the UI. 
For example, a common use case is to store a list of Trios to represent a stack of screens.</p><pre>data class ParentState(<br>  @PersistState val trioStack: List&lt;Trio&gt;<br>) : MavericksState</pre><p>The PersistState annotation is a mechanism of Mavericks that automatically saves and restores parcelable State values across process death, so the navigation state is preserved. A compile time validation ensures that Trio values in State classes are annotated like this so that their state is always saved correctly.</p><p>The ViewModel controls this state, and can expose functions to push a new screen or pop off a screen. Since the ViewModel has direct control over the list of Trios, it can also easily perform more complex navigation changes such as reordering screens, dropping multiple screens, or clearing all screens. This makes navigation extremely flexible.</p><pre>class ParentViewModel : TrioViewModel {<br>  fun pushScreen(trio: Trio) = setState {<br>    copy(trioStack = trioStack + trio)<br>  }<br><br>  fun pop() = setState {<br>    copy(trioStack = trioStack.dropLast(1))<br>  }<br>}</pre><p>The Parent Trio’s UI accesses the Trio list from State and chooses how and where to place the Trios. We can implement a screen flow by showing the latest Trio in the stack.</p><pre>@Composable<br>override fun TrioRenderScope.Content(state: ParentState) {<br>  ShowTrio(state.trioStack.last())<br>}</pre><h4>Coordinating Navigation</h4><p>Why store Trios in State? Alternative approaches might use a navigator object in the Compose UI. However, representing the application’s navigation graph in State allows the ViewModel to update its data and navigation in a single place. This can be extremely helpful when we need to delay making a navigation change until after an asynchronous action, like a network request, completes. 
We could not do this easily with Fragments and found that with Trio’s approach, our navigation becomes simpler, more explicit, and more easily testable.</p><p>This example shows how the ViewModel can handle a “save and exit” call from the UI by launching a suspending network request in a coroutine. Once the request completes, we can pop the screen by updating the Trio stack in State. We can also atomically modify other values in the state at the same time, perhaps based on the result of the network request. This easily guarantees that navigation and ViewModel state stay in sync.</p><pre>class CounterViewModel : TrioViewModel {<br><br>  fun saveAndExit() = viewModelScope.launch {<br>    val success = performSaveRequest()<br><br>    setState {<br>      copy(<br>        trioStack = trioStack.dropLast(1),<br>        success = success<br>      )<br>    }<br>  }<br>}</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*2w6-_2Ze5jXKTaR9"></figure><p>As the navigation stack becomes more complex, application UI hierarchy gets modeled by a chain of ViewModels and their States. As the state is rendered, it creates a corresponding Compose UI hierarchy.</p><p>A Trio can represent an arbitrary UI element of any size, including nested screens and sections, while providing a backing state and a mechanism to communicate with other Trios in the hierarchy.</p><p>There are two additional nice benefits of modeling the hierarchy in ViewModel state like this. One is that it becomes simple to specify custom navigation scenarios when setting up testing — we can easily create whatever navigation states we want for our tests.</p><p>Another benefit is that since the navigation hierarchy is decoupled from the Compose UI, we can pre-load Trios that we anticipate needing, just by initializing their ViewModels ahead of time. 
This has made it significantly simpler for us to optimize performance through preloading screens.</p><p>Mavericks State typically holds simple data classes, and not complex objects like a Trio, which have a lifecycle. However, we find that the benefits this approach brings are well worth the extra complexity.</p><h4>Managing Activities</h4><p>Ideally, an application with Trio would use just a single activity, following the standard <a href="https://developer.android.com/topic/architecture/recommendations">application architecture recommendation</a> from Google. However, especially for interop purposes, Trios will sometimes need to start new activity intents. Traditionally, this isn’t done from a ViewModel because ViewModels <a href="https://developer.android.com/topic/libraries/architecture/viewmodel#:~:text=A%20ViewModel%20usually%20shouldn%27t%20reference">should not contain Activity references</a>, since they outlive the Activity lifecycle; however, in order to maintain our paradigm of doing all navigation in the ViewModel, Trio makes an exception.</p><p>During initialization, the Trio ViewModel is given a Flow of Activity via its initializer. This Flow provides the current activity that the ViewModel is attached to, and null when it is detached, such as during activity recreation. Trio internals manage the Flow to guarantee that it is up to date and the activity is not leaked.</p><p>When needed, a ViewModel can access the next non-null activity value via the awaitActivity suspend function. 
For example, we can use it to start a new activity after a network request completes.</p><pre>class ViewModelInitializer&lt;S : MavericksState&gt;(<br>  val initialState: S,<br>  internal val activityFlow: Flow&lt;Activity?&gt;,<br>  ...<br>)<br><br>class CounterViewModel(<br>  initializer: ViewModelInitializer<br>) : TrioViewModel {<br><br>  fun saveAndOpenNextPage() = viewModelScope.launch {<br>    performSaveRequest()<br>    awaitActivity().startActivity()<br>  }<br>}</pre><p>The awaitActivity function is provided by the TrioViewModel as a convenient way to get the next value in the activity flow.</p><pre>suspend fun awaitActivity(): ComponentActivity {<br>  return initializer.activityFlow.filterNotNull().first()<br>}</pre><p>While a bit unorthodox, this pattern allows activity-based navigation to also be collocated with other business logic in the ViewModel.</p><h4>Modularization Structure</h4><p>Properly modularizing a large code base is a problem that many applications face. At Airbnb, we’ve split our codebase into over 2000 modules to allow faster build speeds and explicit ownership boundaries. To support this, we’ve built an in house navigation system that decouples feature modules. It was originally created to support Fragments and Activities, and was later expanded to integrate with Trio, helping us to solve the general problem of navigation at scale in a large application.</p><p>In our project structure, each module has a specific type, indicated by its prefix and suffix, which defines its purpose and enforces a set of rules about which other modules it can depend on.</p><p>Feature modules, prefixed with “feat”, contain our Trio screens; each screen in the app might live in its own separate module. To prevent circular dependencies and improve build speeds, we do not allow feature modules to depend on each other.</p><p>This means that one feature cannot directly instantiate another. 
Instead, each feature module has a corresponding navigation module, suffixed with “nav”, which defines a router to its feature. To avoid a circular dependency, the router and its destination Trio are associated with Dagger multibinding.</p><p>In this simple example, we have a counter feature and a decimal feature. The counter feature can open the decimal feature to modify the decimal count, so the counter module needs to depend on the decimal navigation module.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*kDpH6Vi7Y6HvFkyQ"></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/300/1*yTEItzOwdvBn3soU3phGzg.gif"></figure><h4>Routing</h4><p>The navigation module is small. It contains only a Routers class with nested Router objects corresponding to each Trio in the feature module.</p><pre>// In feat.decimal.nav<br>@Plugin(pluginPoint = RoutersPluginPoint::class)<br>class DecimalRouters : RouterDeclarations() {<br><br>  @Parcelize<br>  data class DecimalArgs(val count: Double) : Parcelable<br><br>  object DecimalScreen <br>    : TrioRouter&lt;DecimalArgs, NavigationProps, NoResult&gt;<br>}</pre><p>A Router object is parameterized with the types that define the Trio’s public interface: the Arguments to instantiate it, the Props that it uses for active communication, and if desired, the Result that the Trio returns.</p><p>Arguments is a data class, often including primitive data indicating starting values for a screen.</p><p>Importantly, the Routers class is annotated with @Plugin to declare that it should be added to the Routers PluginPoint. This annotation is part of an internal KSP processor that we use for dependency injection, but it essentially just generates the boilerplate code to set up a Dagger multibinding set. 
The result is that each Routers class is added to a set, which we can access from the Dagger graph at runtime.</p><p>On the corresponding Trio class in the feature module, we use the @TrioRouter annotation to specify which Router the Trio maps to. Our KSP processor matches these at compile time, and generates code that we can use at runtime to find the Trio destination for each Router.</p><pre>// In feat.decimal<br>@TrioRouter(DecimalRouters.DecimalScreen::class)<br>class DecimalScreen(<br>  initializer: Initializer&lt;DecimalArgs, ...&gt;<br>) : Trio&lt;DecimalArgs, NavigationProps, ...&gt;</pre><p>The processor validates at compile time that the Arguments and Props on the Router match the types on the Trio, and that each Router has a single corresponding destination. This guarantees runtime type safety in our navigation system.</p><h4>Router Usage</h4><p>Instead of manually instantiating Trios, we let the Router do it for us. The Router ensures that the proper type of Arguments is provided, looks up the matching Trio class in the Dagger graph, creates the initializer class to wrap the arguments, and finally, uses reflection to invoke the Trio’s constructor.</p><p>This functionality is accessible through a createTrio function on the router, which we can invoke from the ViewModel. This allows us to easily create a new instance of a Trio, and push it onto our Trio stack. 
In the following example, the Props instance allows the Trio to call back to its parent to perform this push; we’ll explore Props in detail in Part 3 of this series.</p><pre>class CounterViewModel : TrioViewModel {<br><br>  fun showDecimal(count: Double) {<br>    val trio = DecimalRouters.DecimalScreen.createTrio(DecimalArgs(count))<br>    props.pushScreen(trio)<br>  }<br>}</pre><p>If we want to instead start a Trio in a new activity, the Router also provides a function to create an intent for a new activity that wraps the Trio instance; we can then start it from the ViewModel using Trio’s activity mechanism, as discussed earlier.</p><pre>class CounterViewModel : TrioViewModel {<br><br>  fun showDecimal(count: Double) = viewModelScope.launch {<br>    val activity = awaitActivity()<br>    val intent = DecimalRouters.DecimalScreen<br>                    .newIntent(activity, DecimalArgs(count))<br>    <br>    activity.startActivity(intent)<br>  }<br>}</pre><p>When a Trio is started in a new activity, we simply need to extract the Parcelable Trio instance from the intent, and show it at the root of the Activity’s content.</p><pre>class TrioActivity : ComponentActivity() {<br>  override fun onCreate(savedInstanceState: Bundle?) 
{<br>    super.onCreate(savedInstanceState)<br><br>    val trio = intent.parseTrio()<br>    setContent {<br>      ShowTrio(trio)<br>    }<br>  }<br>}</pre><p>We can also start activities for a result by defining a Result type on the router.</p><pre>class DecimalRouters : RouterDeclarations() {<br><br>  data class DecimalResult(val count: Double)<br><br>  object DecimalScreen : TrioRouter&lt;DecimalArgs, …, DecimalResult&gt;<br>}</pre><p>In this case, the ViewModel contains a “launcher” property, which is used to start the new activity.</p><pre>class CounterViewModel : TrioViewModel {<br><br>  val decimalLauncher = DecimalScreen.createResultLauncher { result -&gt;<br>    setState {<br>      copy(count = result.count)<br>    }<br>  }<br><br>  fun showDecimal(count: Double) {<br>    decimalLauncher.startActivityForResult(DecimalArgs(count))<br>  }<br>}</pre><p>For example, if the user adjusts the decimals on the decimal screen, we could return the new count to update our state in the counter. The lambda argument to the launcher allows us to handle the result when the decimal screen returns, which we can then use to update the state. This furthers our goal of centralizing all navigation in the ViewModel, while guaranteeing type safety.</p><blockquote>Our Router system offers other nice features in addition to modularization, like interceptor chains in the Router resolution providing intermediary screens before showing the final Trio destination. We use this to redirect users to the login page when required, and also to show a loading page if a dynamic feature needs to be downloaded first.</blockquote><h4>Fragment Interop</h4><p>Making Trio screens interoperable with our existing Fragment screens was very important to us. Our migration to Trio is a years-long effort, and Trios and Fragments need to easily coexist.</p><p>Our approach to interoperability is twofold. 
First, if a Fragment and Trio don’t need to dynamically share information while created (i.e., they only take initial arguments and return a result), then it is easiest to start a new activity when transitioning between a Fragment and a Trio. Both architecture types can be easily started in a new activity with Arguments, and can optionally return a result when finished, so it is very easy to navigate between them this way.</p><p>Alternatively, if a Trio and Fragment screen need to share data between themselves while the screens are both active (i.e., the equivalent of Props with Trio), or they need to share complex data that is too large to pass with Arguments, then the Trio can be nested within an “Interop Fragment”, and the two Fragments can be shown in the same activity. The Fragments can communicate via a shared ViewModel, similar to how Fragments normally share ViewModels with Mavericks.</p><p>Our Router object makes it easy to create and show a Trio from another Fragment, with a single function call:</p><pre>class LegacyFragment : MavericksFragment {<br><br>  fun showTrioScreen() {        <br>    showFragment(<br>      CounterRouters<br>             .CounterScreen<br>             .newInteropFragment(SharedCounterViewModelPropsAdapter::class)<br>    )<br>  }<br>}</pre><p>The Router creates a shell Fragment and renders the Trio inside of it. An optional adapter class, the SharedCounterViewModelPropsAdapter in the above example, can be passed to the Fragment to specify how the Trio will communicate with Mavericks ViewModels used by other Fragments in the activity. 
This adapter allows the Trio to specify which ViewModels it wants to access, and creates a StateFlow that converts those ViewModel states into the Props class that the Trio consumes.</p><pre>class SharedCounterViewModelPropsAdapter : LegacyViewModelPropsAdapter&lt;SharedCounterScreenProps&gt; {<br>    <br>override suspend fun createPropsStateFlow(<br>  legacyViewModelProvider: LegacyViewModelProvider,<br>  navController: NavController&lt;SharedCounterScreenProps&gt;,<br>  scope: CoroutineScope<br>): StateFlow&lt;SharedCounterScreenProps&gt; {<br>         <br>  // Look up an activity view model<br>  val sharedCounterViewModel: SharedCounterViewModel = legacyViewModelProvider.getActivityViewModel()<br>         <br>  // You can look up multiple view models if necessary<br>  val fragmentClickViewModel: SharedCounterViewModel = legacyViewModelProvider.requireExistingViewModel(viewModelKey = {<br>    SharedCounterViewModelKeys.fragmentOnlyCounterKey<br>  })<br><br>  // Combine state updates into Props for the Trio, <br>  // and return as a StateFlow. This will be invoked anytime<br>  // any state flow has a new state object.<br>  return combine(sharedCounterViewModel.stateFlow, fragmentClickViewModel.stateFlow) { sharedState, fragmentState -&gt;<br>            SharedCounterScreenProps(<br>                navController = navController,<br>                sharedClickCount = sharedState.count,<br>                fragmentClickCount = fragmentState.count,<br>                increaseSharedCount = {<br>                    sharedCounterViewModel.increaseCounter()<br>                }<br>            )<br>    }.stateIn(scope)<br>  }<br>}</pre><h4>Conclusion</h4><p>In this article, we discussed how navigation works in Trio. 
We use some unique approaches, such as our custom routing system, providing access to activities in a ViewModel, and storing Trios in the ViewModel State to achieve our goals of modularization, interoperability, and making it simpler to reason about navigation logic.</p><p>Stay tuned for Part 3, where we will explain how Trio’s Props enable dynamic communication between screens.</p><p>And if this sounds like the kind of challenge you love working on, check out <a href="https://careers.airbnb.com/">open roles</a> — we’re hiring!</p><hr><p><a href="https://medium.com/airbnb-engineering/introducing-trio-part-ii-fe836013a798">Introducing Trio | Part II</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/introducing-trio-part-ii-fe836013a798</link>
      <guid>https://medium.com/airbnb-engineering/introducing-trio-part-ii-fe836013a798</guid>
      <pubDate>Thu, 04 Apr 2024 18:52:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Introducing Trio | Part I]]></title>
      <description><![CDATA[<h4>A three-part series on how we built a Compose-based architecture with Mavericks in the Airbnb Android app</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Iiod_-uJ2bHY5BSbXyStfg.jpeg"></figure><p><strong>By:</strong> <a href="https://www.linkedin.com/in/eli-hart-54a4b975/">Eli Hart</a>, <a href="https://www.linkedin.com/in/schwabben/">Ben Schwab</a>, <a href="https://www.linkedin.com/in/yvonnejwong">Yvonne Wong</a></p><p>At Airbnb, we have developed an Android framework for Jetpack Compose screen architecture, which we call Trio. Trio is built on our open-source library Mavericks, which it leverages to maintain both navigation and application state within the ViewModel.</p><p>Airbnb began development of Trio more than two years ago, and has been using it in production for over a year and a half. It is powering a significant portion of our production screens in Airbnb’s Android app, and has enabled our engineers to create features in 100% Compose UI.</p><p>In this blog post series, we will look at how Mavericks can be used in modern, Compose-based applications. We will discuss the challenges of Compose-based architecture and how Trio has attempted to solve them. This will include an exploration of concepts such as:</p><ul><li>Type-safe navigation between feature modules</li><li>Storing navigation state in a ViewModel</li><li>Communication between Compose-based screens, including opening screens for results and two-way communication between screens</li><li>Compile-time validation of navigation and communication interfaces</li><li>Developer tools created to support Trio workflows</li></ul><p>This series is split into three parts. Part 1 (this blog post) covers Trio’s high-level architecture. 
Stay tuned for Part 2, which will detail Trio’s navigation system, and Part 3, which will examine how Trio uses Props for communication between screens.</p><h4>Background on Mavericks</h4><p>To understand Trio’s architecture, it’s important to know the basics of Mavericks, which Trio is built on top of. Airbnb originally open sourced <a href="https://github.com/airbnb/mavericks">Mavericks</a> in 2018 to simplify and standardize how state is managed in a Jetpack ViewModel. Check out <a href="https://medium.com/airbnb-engineering/introducing-mvrx-android-on-autopilot-552bca86bd0a">this post</a> from the initial Mavericks (“MvRx”) launch for a deeper dive.</p><p>Used in virtually all the hundreds of screens in Airbnb’s Android app (and by many other companies too!), Mavericks is a state management library that is decoupled from the UI, and can be used with any UI system. The core concept is that screen UI is modeled as a function of state. This ensures that even the most complex screen can be rendered in a way that’s thread safe, independent of the order of events leading up to it, and easy to reason about and test.</p><p>To achieve this, Mavericks enforces the pattern that all data exposed by the ViewModel must be contained within a single MavericksState data class. In a simple Counter example, the state would contain the current count.</p><pre>data class CounterState(<br>  val count: Int = 0<br>) : MavericksState</pre><p>State properties can only be updated in the ViewModel via calls to setState. The setState function takes a “reducer” lambda, which, given a previous state, outputs a new state. We can use a reducer to increment the count by simply adding 1 to the previous value.</p><pre>class CounterViewModel : MavericksViewModel&lt;CounterState&gt;(...) 
{<br>  fun incrementCount() {<br>    setState {<br>      // this = previous state<br>      this.copy(count = count + 1)<br>    }<br>  }<br>}</pre><p>The base MavericksViewModel enqueues all calls to setState and runs them serially in a background thread. This guarantees thread safety when changes are made in multiple places at once, and ensures that changes to multiple properties in the state are atomic, so the UI never sees a state that is only partially updated.</p><p>MavericksViewModel exposes state changes via a coroutine Flow property. When paired with reactive UI, like Compose, we can collect the latest state value and guarantee that the UI is updated with every state change.</p><pre>counterViewModel.stateFlow.collectAsState().value.count</pre><p>This unidirectional cycle can be visualized with the following diagram:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*uupeoNHeomWXsIcr"></figure><h4>Challenges with Fragment-based architecture</h4><p>While Mavericks works well for state management, we were still experiencing some challenges with Android UI development, stemming from the fact that we were using a <a href="https://developer.android.com/reference/android/app/Fragment">Fragment</a>-based architecture integrated with Mavericks. With this approach, ViewModels are mainly scoped to the Activity and shared between Fragments via injection. Fragment views are updated by state changes from the ViewModel, and call back to the ViewModel to make state changes. The Fragment Manager manages navigation independently when Fragments need to be pushed or popped.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*wVXV6H1MbpWpoeJ8"></figure><p>Due to this architecture, we were running up against some ongoing difficulties, which became the motivation for building Trio.</p><ol><li><strong>Scoping</strong> — Sharing ViewModels between multiple Fragments relies on the implicit injection of the ViewModel. 
Thus, it isn’t clear which Fragment is responsible for creating the Activity ViewModel originally, or for providing the initial arguments to it.</li><li><strong>Communication</strong> — It’s difficult to share data between Fragments directly and with type safety. Again, because ViewModels are injected, it’s hard to have them communicate directly, and we don’t have good control over the ordering of their creation.</li><li><strong>Navigation</strong> — Navigation is done via the Fragment Manager and must happen in the Fragment. However, state changes are done in the ViewModel. This leads to synchronization problems between ViewModel and navigation states. It’s hard to coordinate if-then scenarios like making a navigation call only after updating a state value in the ViewModel.</li><li><strong>Testability</strong> — It’s difficult to isolate the UI for testing because it is wrapped in the Fragment. Screenshot tests are prone to flakiness and a lot of indirection is required for mocking the ViewModel state, because ViewModels are injected into the Fragment with property delegates.</li><li><strong>Reactivity</strong> — Mavericks provides a unidirectional state flow to the View, which is helpful for consistency and testing, but the View system doesn’t lend itself well to reactive updates to state changes, and it can be difficult or inefficient to update the view incrementally on each state change.</li></ol><p>While some of these problems could have been mitigated by using a better Fragment based architecture, we found that Fragments were overall too limiting with Compose and decided to move away from them entirely.</p><h4>Why we built Trio</h4><p>In 2021, our team began to explore <a href="https://android-developers.googleblog.com/2022/05/airbnb-uses-jetpack-compose.html">adopting Jetpack Compose</a> and completely transitioning away from Fragments. 
By fully embracing Compose, we could better prepare ourselves for future Android developments and eliminate years of accumulated tech debt.</p><p>Continuing to use Mavericks was important to us because we have a large amount of internal experience with it, and we didn’t want to further complicate an architectural migration by also changing our state management approach. We saw an opportunity to rethink how Mavericks could support a modern Android application, and address problems we encountered with our previous architecture.</p><p>With Fragments, we struggled to guarantee type-safe communication between screens at runtime. We wanted to be able to codify the expectations about how ViewModels are used and shared, and what interfaces look like between screens.</p><p>We also didn’t feel our needs were fully met by the Jetpack Navigation component, especially given our heavily modularized code base and large app. The Navigation component is <a href="https://developer.android.com/jetpack/compose/navigation#type-safety">not type safe</a>, requires defining the navigation graph in a single place, and doesn’t allow us to co-locate state in our ViewModel. We looked for a new architecture that could provide better type safety and modularization support.</p><p>Finally, we wanted an architecture that would improve testability, such as more stable screenshot and UI tests, and simpler navigation testing.</p><p>We considered the open source libraries Workflow and RIBs, but opted not to use them because they were not Compose-first and were not compatible with Mavericks and our other pre-existing internal frameworks.</p><p>Given these requirements, our decision was to develop our own solution, which we named Trio.</p><h4>Trio Architecture</h4><p>Trio is an opinionated framework for building features. It helps us to define and manage boundaries and state in Compose UI. 
Trio also standardizes how state is hoisted from Compose UI and how events are handled, enforcing unidirectional data flow with Mavericks. The design was inspired by Square’s <a href="https://github.com/square/workflow">Workflow</a> library; Trio differs in that it was designed specifically for Compose and uses Mavericks ViewModels for managing state and events.</p><p>Self-contained blocks are called “Trios”, named for the three main classes they contain. Each Trio has its own ViewModel, State, and UI, and can communicate with and be nested in other Trios. The following diagram represents how these components work together. The ViewModel makes changes to state via Mavericks reducers, the UI receives the latest state value to render, and events are routed back to the ViewModel for further state updates.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*NyLOH4OpT6k9DR1b"></figure><p>If you’re already familiar with Mavericks, this pattern should look very familiar! The ViewModel and State usage is very similar to what we did with Fragments. What’s new is how we embed the ViewModels in Compose UI and add Routing and Props-based communication via Trio.</p><p>Trios are nested to form custom, flexible navigation hierarchies. “Parent” Trios create child Trios with initial arguments through a Router, and store those children in their State. The parent can then communicate dynamically with its children through a flow of Props, which provide data, dependencies, and functional callbacks.</p><p>The framework helps us to guarantee type safety when navigating and communicating between Trios, especially across module boundaries.</p><p>Each Trio can be tested individually by instantiating it with mocked arguments, State, and Props. 
Coupled with Compose’s state-based rendering and Mavericks’ immutable state patterns, this provides controlled and deterministic testing environments.</p><h4>The Trio Class</h4><p>Creating a new Trio implementation requires subclassing the Trio base class. The Trio class is typed to define Args, Props, State, ViewModel, and UI; this allows us to guarantee type-safe navigation and inter-screen communication.</p><pre>class CounterScreen : Trio&lt;<br>  CounterArgs,<br>  CounterProps,<br>  CounterState,<br>  CounterViewModel,<br>  CounterUI<br>&gt;</pre><p>A Trio is created with either an initial set of arguments or an initial state, which are wrapped in a sealed class called the Initializer. In production, the Initializer will only contain Args passed from another screen, but in development we can seed the Initializer with mock state so that the screen can be loaded standalone, independent of the normal navigation hierarchy.</p><pre>class CounterScreen(<br>  initializer: Initializer&lt;CounterArgs, CounterState&gt;<br>) </pre><p>Then, in our subclass body, we define how we want to create our State, ViewModel, and UI, given the starting values of Args and Props.</p><p>Args and Props both provide input data, with the difference being that Args are static while Props are dynamic. Args guarantee the stability of static information, such as IDs used to start a screen, while Props allow us to subscribe to data that may change over time.</p><pre>override fun createInitialState(args: CounterArgs, props: CounterProps): CounterState {<br>  return CounterState(args.count)<br>}</pre><p>Trio provides an initializer to create a new ViewModel instance, passing necessary information like the Trio’s unique ID, a Flow of Props, and a reference to the parent Activity. 
Dependencies from the application’s dependency graph can also be passed to the ViewModel through its constructor.</p><pre>override fun createViewModel(<br>  initializer: Initializer&lt;CounterProps, CounterState&gt;<br>): CounterViewModel {<br>  return CounterViewModel(initializer)<br>}</pre><p>Finally, the UI class wraps the composable code used to render the Trio. The UI class receives a flow of the latest State from the ViewModel, and also uses the ViewModel reference to call back to it when handling UI events.</p><pre>override fun createUI(viewModel: CounterViewModel): CounterUI {<br>  return CounterUI(viewModel)<br>}</pre><p>We like that grouping all of these factory functions in the Trio class makes it explicit how each class is created, and standardizes where to look to understand dependencies. However, it can also feel like boilerplate. As an improvement, we often use reflection to create the UI class, and we use assisted injection to automate creation of the ViewModel with Dagger dependencies.</p><p>The resulting Trio declaration as a whole looks like this:</p><pre>class CounterScreen(<br>  initializer: Initializer&lt;CounterArgs, CounterState&gt;<br>) : Trio&lt;<br>  CounterArgs,<br>  CounterProps,<br>  CounterState,<br>  CounterViewModel,<br>  CounterUI<br>&gt;(initializer) {<br><br>  override fun createInitialState(args: CounterArgs, props: CounterProps): CounterState {<br>    return CounterState(args.count)<br>  }<br>}</pre><h4>The UI Class</h4><p>The Trio’s UI class implements a single Composable function named “Content”, which determines the UI that the Trio shows. Additionally, the Content function has a “TrioRenderScope” receiver type. 
This is a Compose animation scope that allows us to customize the Trio’s animations when it is displayed.</p><pre>class CounterUI(<br>  override val viewModel: CounterViewModel<br>) : UI&lt;CounterState, CounterViewModel&gt; {<br>   <br>  @Composable<br>  override fun TrioRenderScope.Content(state: CounterState) {<br>    Column {<br>      TopAppBar()<br>      Button(<br>        text = state.count,<br>        modifier = Modifier.clickable {<br>          viewModel.incrementCount()<br>        }<br>      )<br>      ...<br>    }<br>  }<br>}</pre><p>The Content function is recomposed every time the State from the ViewModel changes. The UI directs all UI events, such as clicks, back to the ViewModel for handling.</p><p>This design enforces unidirectional data flow, and testing the UI is easy because it is decoupled from the logic of state changes and event handling. It also standardizes how Compose state is hoisted for consistency across screens, while removing the boilerplate of setting up access to the ViewModel’s state flow.</p><h4>Rendering a Trio</h4><p>Given a Trio instance, we can render it by invoking its Content function, which uses the previously mentioned factory functions to create initial values of the ViewModel, State, and UI. The state flow is collected from the ViewModel and passed to the UI’s Content function. 
The UI is wrapped in a Box to respect the constraints and modifier of the caller.</p><pre>@Composable<br>internal fun TrioRenderScope.Content(modifier: Modifier = Modifier) {<br>  key(trioId) {<br>    val activity = LocalContext.current as ComponentActivity<br><br>    val viewModel = remember {<br>      getOrCreateViewModel(activity)<br>    }<br><br>    val ui = remember { createUI(viewModel) }<br><br>    val state = viewModel.stateFlow<br>                    .collectAsState(viewModel.currentState).value<br><br>    Box(propagateMinConstraints = true, modifier = modifier) {<br>      ui.Content(state = state)<br>    }<br>  }<br>}</pre><p>To enable customizing entry and exit animations, the Content function also uses a TrioRenderScope receiver; this wraps an implementation of Compose’s <a href="https://developer.android.com/reference/kotlin/androidx/compose/animation/AnimatedVisibilityScope">AnimatedVisibilityScope</a> which displays the Content. A helper function is used to coordinate this.</p><pre>@Composable<br>fun ShowTrio(trio: Trio, modifier: Modifier) {<br>  AnimatedVisibility(<br>    visible = true,<br>    enter = EnterTransition.None,<br>    exit = ExitTransition.None<br>  ) {<br>    val animationScope = TrioRenderScopeImpl(this)<br>    trio.Content(modifier, animationScope)<br>  }<br>}</pre><p>In practice, the actual implementation of Trio.Content is quite a bit more complex because of additional tooling and edge cases we want to support — such as tracking the Trio’s lifecycle, managing saved state, and mocking the ViewModel when shown within a screenshot test or IDE preview.</p><h4>Conclusion</h4><p>In this introduction to Trio we discussed Airbnb’s background with Mavericks and Fragments, and why we built Trio to transition to a Jetpack Compose-based architecture. 
We presented an overview of Trio’s architecture, and looked at core components such as the Trio class and UI class.</p><p>In upcoming articles, we will continue this three-part series by detailing how navigation works with Trio, and how Trio’s Props allow dynamic communication between screens. And if this work sounds interesting to you, check out <a href="https://careers.airbnb.com/">open roles</a> at Airbnb!</p><hr><p><a href="https://medium.com/airbnb-engineering/introducing-trio-part-i-7f5017a1a903">Introducing Trio | Part I</a> was originally published in <a href="https://medium.com/airbnb-engineering">The Airbnb Tech Blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></description>
      <link>https://medium.com/airbnb-engineering/introducing-trio-part-i-7f5017a1a903</link>
      <guid>https://medium.com/airbnb-engineering/introducing-trio-part-i-7f5017a1a903</guid>
      <pubDate>Thu, 28 Mar 2024 18:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Migrating Our iOS Build System from Buck to Bazel]]></title>
      <description><![CDATA[<div class="gk gl gm gn go"><div class="ab ca"><div class="ch bg fw fx fy fz"><div><div class="hs ht hu hv hw"></div><figure class="na nb nc nd ne nf mx my paragraph-image"><div role="button" tabindex="0" class="ng nh fg ni bg nj"><div class="mx my mz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*sglF3oUVjjvXb8yo 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*sglF3oUVjjvXb8yo 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*sglF3oUVjjvXb8yo 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*sglF3oUVjjvXb8yo 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*sglF3oUVjjvXb8yo 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*sglF3oUVjjvXb8yo 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*sglF3oUVjjvXb8yo 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*sglF3oUVjjvXb8yo 640w, https://miro.medium.com/v2/resize:fit:720/0*sglF3oUVjjvXb8yo 720w, https://miro.medium.com/v2/resize:fit:750/0*sglF3oUVjjvXb8yo 750w, https://miro.medium.com/v2/resize:fit:786/0*sglF3oUVjjvXb8yo 786w, https://miro.medium.com/v2/resize:fit:828/0*sglF3oUVjjvXb8yo 828w, https://miro.medium.com/v2/resize:fit:1100/0*sglF3oUVjjvXb8yo 1100w, https://miro.medium.com/v2/resize:fit:1400/0*sglF3oUVjjvXb8yo 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 
50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h2 id="2af3" class="nl nm gr be nn no np dx nq nr ns dz nt nu nv nw nx ny nz oa ob oc od oe of og bj"><strong class="al">How Airbnb achieved a smooth and transparent migration from Buck to Bazel on iOS, with minimal interference to developer workflows</strong></h2><p id="6451" class="pw-post-body-paragraph oh oi gr oj b ok ol om on oo op oq or nu os ot ou ny ov ow ox oc oy oz pa pb gk bj"><strong class="oj gs">By:</strong> <a class="af pc" href="https://www.linkedin.com/in/qyangx/" rel="noopener ugc nofollow" target="_blank">Qing Yang</a>, <a class="af pc" href="https://www.linkedin.com/in/andybartholomew1/" rel="noopener ugc nofollow" target="_blank">Andy Bartholomew</a></p><p id="fb3e" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">At Airbnb, we are committed to providing the best experience for our engineers. To offer a cohesive and efficient build experience across all platforms, we’ve decided to adopt Bazel as our build system. Bazel is a robust build system widely utilized in the industry. In alignment with Airbnb’s tech initiatives, both our backend and frontend teams initiated the migration process to Bazel. In the first Bazel post, we start with our iOS development migrating from Buck to Bazel.</p><p id="cdbb" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">We’ll describe the migration approach which involved two main pieces of work: migrating the build configuration and migrating the IDE integration. 
Such a transition can potentially disrupt engineers’ workflows or hinder the development of new features, but we were able to complete both pieces of work without disrupting the day-to-day developer experience. Our aim is to help others who are currently undertaking or planning a similar migration.</p><h1 id="96ae" class="pi nm gr be nn pj pk pl nq pm pn po nt pp pq pr ps pt pu pv pw px py pz qa qb bj">Migrating the Build Configuration</h1><p id="f6d4" class="pw-post-body-paragraph oh oi gr oj b ok ol om on oo op oq or nu os ot ou ny ov ow ox oc oy oz pa pb gk bj">When it comes to build configuration, Buck and Bazel exhibit significant similarities. They share a comparable directory structure, employ similar command line invocations, and, importantly, both utilize the <a class="af pc" href="https://bazel.build/rules/language" rel="noopener ugc nofollow" target="_blank">Starlark language</a>. These similarities present an opportunity for configuration sharing between the two build systems. This would allow us to reuse our Buck configurations in Bazel, while avoiding slowdowns during the “overlap” phase when we were in the process of migrating and still actively using both build systems.</p><p id="1df1" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Unfortunately, there’s a major problem: Buck and Bazel employ distinct rules with different parameters. 
For instance, Buck offers rules such as <code class="cw qc qd qe qf b"><a class="af pc" href="https://buck.build/rule/apple_library.html" rel="noopener ugc nofollow" target="_blank">apple_library</a></code> and <code class="cw qc qd qe qf b"><a class="af pc" href="https://buck.build/rule/apple_binary.html" rel="noopener ugc nofollow" target="_blank">apple_binary</a></code>, whereas Bazel, depending on the external rule sets, features rules like <code class="cw qc qd qe qf b"><a class="af pc" href="https://github.com/bazelbuild/rules_swift/blob/master/doc/rules.md#swift_library" rel="noopener ugc nofollow" target="_blank">swift_library</a></code> and <code class="cw qc qd qe qf b"><a class="af pc" href="https://github.com/bazel-ios/rules_ios/blob/master/docs/framework_doc.md#apple_framework" rel="noopener ugc nofollow" target="_blank">apple_framework</a></code>. Even in cases where the two systems have rules with the same name, such as <code class="cw qc qd qe qf b">genrule</code>, the syntax for configuring those rules often differs. The different design philosophies of these two systems result in various incompatibilities as well. 
For instance, Bazel doesn’t have the <code class="cw qc qd qe qf b">read_config</code> function to read command line options in a macro.</p><h2 id="3537" class="nl nm gr be nn no np dx nq nr ns dz nt nu nv nw nx ny nz oa ob oc od oe of og bj">Hiding the Differences with rules_shim</h2><p id="5c67" class="pw-post-body-paragraph oh oi gr oj b ok ol om on oo op oq or nu os ot ou ny ov ow ox oc oy oz pa pb gk bj">After conducting an in-depth analysis of both Buck and Bazel, we devised a comprehensive architecture for the build configuration to leverage the similarities and address the differences between each system.</p><figure class="qh qi qj qk ql nf mx my paragraph-image"><div role="button" tabindex="0" class="ng nh fg ni bg nj"><div class="mx my qg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Vj2x8a5sw_cNtukx 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Vj2x8a5sw_cNtukx 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Vj2x8a5sw_cNtukx 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Vj2x8a5sw_cNtukx 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Vj2x8a5sw_cNtukx 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Vj2x8a5sw_cNtukx 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Vj2x8a5sw_cNtukx 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Vj2x8a5sw_cNtukx 640w, 
https://miro.medium.com/v2/resize:fit:720/0*Vj2x8a5sw_cNtukx 720w, https://miro.medium.com/v2/resize:fit:750/0*Vj2x8a5sw_cNtukx 750w, https://miro.medium.com/v2/resize:fit:786/0*Vj2x8a5sw_cNtukx 786w, https://miro.medium.com/v2/resize:fit:828/0*Vj2x8a5sw_cNtukx 828w, https://miro.medium.com/v2/resize:fit:1100/0*Vj2x8a5sw_cNtukx 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Vj2x8a5sw_cNtukx 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qm fc qn mx my qo qp be b bf z dt">The build configuration layers</figcaption></figure><p id="32ac" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">At the core of this architecture lies the <code class="cw qc qd qe qf b">rules_shim</code> layer, which introduces two sets of rules: one for Buck and another for Bazel. These rule sets act as wrappers around the native and external rules, offering unified interfaces to the layers above.</p><p id="d035" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">How does <code class="cw qc qd qe qf b">rules_shim</code> work, exactly? 
By making use of local repositories, we can point the <code class="cw qc qd qe qf b">rules_shim</code> repository to different implementations depending on the build system.</p><p id="6430" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">This is what the result looks like in Buck’s <code class="cw qc qd qe qf b">.buckconfig</code>:</p><pre class="qh qi qj qk ql qq qf qr bo qs ba bj">[repositories]<br />  rules_shim = rules_shim/buck<br /><br />[buildfile]<br />  name = BUILD</pre><p id="384e" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Note that we’ve also configured Buck to use <code class="cw qc qd qe qf b">BUILD</code> as the config file, and renamed the existing <code class="cw qc qd qe qf b">BUCK</code> files to <code class="cw qc qd qe qf b">BUILD</code>, so the same configuration can be recognized by both Buck and Bazel.</p><p id="4813" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">In Bazel’s <code class="cw qc qd qe qf b">WORKSPACE</code>, we do the following:</p><pre class="qh qi qj qk ql qq qf qr bo qs ba bj">local_repository(<br />  name = "rules_shim",<br />  path = "rules_shim/bazel"<br />)</pre><p id="299c" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">In a regular <code class="cw qc qd qe qf b">BUILD</code> file, we use <code class="cw qc qd qe qf b">my_library</code> to wrap around the native rules and provide the same interface for each application:</p><pre class="qh qi qj qk ql qq qf qr bo qs ba bj">load("@rules_shim//:defs.bzl", "my_library", …)</pre><p id="93d1" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">The app-specific rules layer only needs to know the interface, not the implementation. 
As a result, whenever we execute Buck or Bazel commands, the build system is able to retrieve the corresponding implementation from the <code class="cw qc qd qe qf b">rules_shim</code> layer. A notable advantage of this design is that we can easily remove the <code class="cw qc qd qe qf b">rules_shim/buck</code> after the migration.</p><h2 id="fe0d" class="nl nm gr be nn no np dx nq nr ns dz nt nu nv nw nx ny nz oa ob oc od oe of og bj">Unifying the <code class="cw qc qd qe qf b">genrule</code> interface</h2><p id="3d77" class="pw-post-body-paragraph oh oi gr oj b ok ol om on oo op oq or nu os ot ou ny ov ow ox oc oy oz pa pb gk bj">Within our iOS codebase, we heavily rely on generated code to manage boilerplate and reduce the maintenance burden for engineers. Given the different syntax for genrule scripts between the two build systems, we also designed a unified interface for <code class="cw qc qd qe qf b">genrule</code>. As a result, the same genrule script can function across both build systems. 
As you may have guessed, the conversion process is implemented in the <code class="cw qc qd qe qf b">rules_shim</code> layer.</p><figure class="qh qi qj qk ql nf mx my paragraph-image"><div role="button" tabindex="0" class="ng nh fg ni bg nj"><div class="mx my qy"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 640w, https://miro.medium.com/v2/resize:fit:720/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 720w, https://miro.medium.com/v2/resize:fit:750/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 750w, https://miro.medium.com/v2/resize:fit:786/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 786w, https://miro.medium.com/v2/resize:fit:828/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 1100w, 
https://miro.medium.com/v2/resize:fit:1400/1*UbI2b1QOTDU9YvYbPyDF3w@2x.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qm fc qn mx my qo qp be b bf z dt"><em class="qz">We designed the predefined variables in the unified </em>genrule<em class="qz"> interface.</em></figcaption></figure><h2 id="3c3f" class="nl nm gr be nn no np dx nq nr ns dz nt nu nv nw nx ny nz oa ob oc od oe of og bj">Replacing read_config with select</h2><p id="e1cb" class="pw-post-body-paragraph oh oi gr oj b ok ol om on oo op oq or nu os ot ou ny ov ow ox oc oy oz pa pb gk bj">Conditional configuration is unavoidable, because there are always different variants of a built product, such as debug builds and release builds. Buck provides a function called <code class="cw qc qd qe qf b">read_config</code> that reads command line options in a macro, while Bazel doesn’t have this due to the system’s strict <a class="af pc" href="https://bazel.build/extending/concepts#evaluation-model" rel="noopener ugc nofollow" target="_blank">separation of loading phase</a>. It’s worth noting that Buck does support the <code class="cw qc qd qe qf b"><a class="af pc" href="https://bazel.build/reference/be/functions#select" rel="noopener ugc nofollow" target="_blank">select</a></code> function, although it’s undocumented. 
We have migrated all instances of <code class="cw qc qd qe qf b">read_config</code> to <code class="cw qc qd qe qf b">select</code>-based conditions.</p><pre class="qh qi qj qk ql qq qf qr bo qs ba bj">deps = select({<br />  "//:DebugBuild": non_production_deps,<br />  "//:ReleaseBuild": [],<br />  # SELECT_DEFAULT is defined in rules_shim to accommodate <br />  # the different default strings used by Buck and Bazel<br />  SELECT_DEFAULT: non_production_deps,<br />}),</pre><p id="3819" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Overall, <strong class="oj gs">this design let us use a single build configuration for both build systems</strong>, with minimal changes to our <code class="cw qc qd qe qf b">BUILD</code> files themselves. In practice, iOS engineers at Airbnb rarely need to manually modify <code class="cw qc qd qe qf b">BUILD</code> files, which are automatically updated from an analysis of the underlying source code. When manual edits are needed, engineers can rely on the unified interface without needing to be aware of the specific underlying build system.</p><h1 id="025e" class="pi nm gr be nn pj pk pl nq pm pn po nt pp pq pr ps pt pu pv pw px py pz qa qb bj">Migrating the IDE Integration</h1><p id="31eb" class="pw-post-body-paragraph oh oi gr oj b ok ol om on oo op oq or nu os ot ou ny ov ow ox oc oy oz pa pb gk bj">iOS engineers at Airbnb primarily interact with the build system through Xcode. Since first adopting Buck, we have been using Buck-generated Xcode workspaces for local development. 
Over the years, we’ve developed various productivity-boosting features on top of this setup, including the <a class="af pc" rel="noopener" href="https://medium.com/airbnb-engineering/designing-for-productivity-in-a-large-scale-ios-application-9376a430a0bf">Dev App</a>, a small development application focused on a single module; Buck Local, which uses Buck instead of Xcode for building and leverages remote cache; and Focus Xcode workspace, which significantly improves IDE performance by loading only the modules being worked on.</p><p id="1155" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">In the Bazel ecosystem, multiple solutions exist for generating Xcode workspaces. However, at the time of our evaluation, none of them fully met our requirements. Additionally, any IDE integration needs to support not only building, but also editing, indexing, testing, and debugging. Given the proven track record and stability of our current workspace setup, we deemed the risk of adopting a completely new one to be exceedingly high. Hence, we decided to develop our own generator to create a workspace close to our existing setup. 
We chose <a class="af pc" href="https://github.com/yonaskolb/XcodeGen" rel="noopener ugc nofollow" target="_blank">XcodeGen</a>, a popular tool in this area, because it generates Xcode projects from a YAML configuration, serving as an abstraction layer to separate the build system implementation details.</p><figure class="qh qi qj qk ql nf mx my paragraph-image"><div role="button" tabindex="0" class="ng nh fg ni bg nj"><div class="mx my ra"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*nFo7iIagxopZAwX_ 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*nFo7iIagxopZAwX_ 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*nFo7iIagxopZAwX_ 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*nFo7iIagxopZAwX_ 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*nFo7iIagxopZAwX_ 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*nFo7iIagxopZAwX_ 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*nFo7iIagxopZAwX_ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*nFo7iIagxopZAwX_ 640w, https://miro.medium.com/v2/resize:fit:720/0*nFo7iIagxopZAwX_ 720w, https://miro.medium.com/v2/resize:fit:750/0*nFo7iIagxopZAwX_ 750w, https://miro.medium.com/v2/resize:fit:786/0*nFo7iIagxopZAwX_ 786w, https://miro.medium.com/v2/resize:fit:828/0*nFo7iIagxopZAwX_ 828w, https://miro.medium.com/v2/resize:fit:1100/0*nFo7iIagxopZAwX_ 1100w, 
https://miro.medium.com/v2/resize:fit:1400/0*nFo7iIagxopZAwX_ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qm fc qn mx my qo qp be b bf z dt">The flow of generating the Xcode project</figcaption></figure><p id="9b54" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">We implemented this migration process in three phases.</p><p id="ecc5" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Firstly, we utilized <code class="cw qc qd qe qf b">buck query</code> to gather all the necessary information from the codebase and generate an Xcode workspace, replacing the <code class="cw qc qd qe qf b"><a class="af pc" href="https://buck.build/command/project.html" rel="noopener ugc nofollow" target="_blank">buck project</a></code> command. This new workspace invoked <code class="cw qc qd qe qf b">buck build</code> during the build process. 
By keeping the build system unchanged, we were able to ensure compatibility and evaluate the performance of the new generator.</p><p id="b15b" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Secondly, we performed a parallel implementation in Bazel using <code class="cw qc qd qe qf b">bazel query</code> and <code class="cw qc qd qe qf b">bazel build</code>, incorporating a simple <code class="cw qc qd qe qf b">--bazel</code> option in the generation script that enabled switching between the two build systems within Xcode. Apart from the build system, the user interface remained identical, ensuring that all IDE operations continued to function as before.</p><p id="2267" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Lastly, after a sufficient number of users opted for Bazel and all Bazel-powered features underwent extensive testing, we made the <code class="cw qc qd qe qf b">--bazel</code> option the default, completing a smooth transition to Bazel. Although we didn’t need to, we could have easily rolled back if issues had occurred. A few weeks later, we removed Buck support from the generated project.</p><p id="596d" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">The end result of this migration is impressive. Compared to the Buck-generated project (<code class="cw qc qd qe qf b">buck project</code>), the generation time with XcodeGen has been reduced by 60%, and the open time for Xcode has decreased by more than 70%. 
As a result, this new workspace setup received top rankings in an internal developer experience survey, showcasing the significant improvements achieved through this process.</p><h1 id="4a7c" class="pi nm gr be nn pj pk pl nq pm pn po nt pp pq pr ps pt pu pv pw px py pz qa qb bj">Completing the Migration and Looking Forward</h1><blockquote class="rb rc rd"><p id="e6cd" class="oh oi re oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">“All problems in computer science can be solved by another level of indirection.” — David Wheeler</p></blockquote><p id="3cd9" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Wherever we relied on Buck, we introduced a common interface abstraction and injected separate implementations to handle the differences between Buck and Bazel. Thanks to the “indirection” principle, we were able to test and update each implementation without dramatically rewriting the code, and we transitioned from Buck to Bazel seamlessly across all use cases, including local development, CI testing, and releases. The migration process was executed without disrupting engineers’ workflows and, in fact, allowed us to deliver multiple new features, including SwiftUI Previews support.</p><p id="f4a1" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Since Bazel became our iOS build system, we have observed notable improvements in build times, particularly for incremental builds. This shift has enabled us to leverage shared infrastructure, such as remote cache, alongside other build platforms within Airbnb. 
Consequently, we have fostered increased collaboration across platforms.</p><p id="e16d" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Migrating our iOS build system is just the first of a number of Bazel migrations underway or completed at Airbnb. We have repos for JVM-based languages (Java/Kotlin/Scala), for JavaScript, and for Go, which are either using Bazel already, or will be in the future. We believe a single build tool across our entire codebase will allow us to more effectively leverage our investments in build tooling and training. In the future, we’ll be sharing lessons learned from these other Bazel migrations.</p></div></div></div><div class="ab ca rf rg rh ri" role="separator"><div class="gk gl gm gn go"><div class="ab ca"><div class="ch bg fw fx fy fz"><p id="ba4e" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">Qing Yang, the technical lead for the iOS Bazel migration, designed and implemented the configuration architecture and the new project generator. Andy Bartholomew led the migration of all tests and implemented the script abstraction layer. Xianwen Chen migrated and managed the release builds of Bazel.</p><p id="621f" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">We are immensely grateful for the invaluable support received from numerous Bazel experts at Airbnb. 
Special thanks to Janusz Kudelka for providing valuable advice and guidance on the subject.</p><p id="3e96" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">We also extend our appreciation to the Bazel iOS community for their various open source projects and the assistance they provided throughout our migration journey.</p><p id="eaa1" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj">If you are interested in joining us on our quest to make the best iOS app in the App Store, please see our <a class="af pc" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">careers</a> page for open iOS and Developer Platform roles.</p><p id="0273" class="pw-post-body-paragraph oh oi gr oj b ok pd om on oo pe oq or nu pf ot ou ny pg ow ox oc ph oz pa pb gk bj"><em class="re">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/migrating-our-ios-build-system-from-buck-to-bazel-ddd6f3f25aa3</link>
      <guid>https://medium.com/airbnb-engineering/migrating-our-ios-build-system-from-buck-to-bazel-ddd6f3f25aa3</guid>
      <pubDate>Thu, 01 Feb 2024 18:20:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb at KDD 2023]]></title>
      <description><![CDATA[<div><div class="hs ht hu hv hw"></div><figure class="nd ne nf ng nh ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb nc"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Hko3nVtsyBbe7Yk1 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Hko3nVtsyBbe7Yk1 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Hko3nVtsyBbe7Yk1 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Hko3nVtsyBbe7Yk1 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Hko3nVtsyBbe7Yk1 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Hko3nVtsyBbe7Yk1 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Hko3nVtsyBbe7Yk1 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Hko3nVtsyBbe7Yk1 640w, https://miro.medium.com/v2/resize:fit:720/0*Hko3nVtsyBbe7Yk1 720w, https://miro.medium.com/v2/resize:fit:750/0*Hko3nVtsyBbe7Yk1 750w, https://miro.medium.com/v2/resize:fit:786/0*Hko3nVtsyBbe7Yk1 786w, https://miro.medium.com/v2/resize:fit:828/0*Hko3nVtsyBbe7Yk1 828w, https://miro.medium.com/v2/resize:fit:1100/0*Hko3nVtsyBbe7Yk1 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Hko3nVtsyBbe7Yk1 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="19ec" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">KDD (Knowledge Discovery and Data Mining) is a flagship conference in data science research. Hosted annually by a special interest group of the Association for Computing Machinery (ACM), it’s where you’ll learn about some of the most ground-breaking developments in data mining, knowledge discovery, and large-scale data analytics.</p><p id="ca25" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Airbnb had a significant presence at <a class="af om" href="https://kdd.org/kdd2023/" rel="noopener ugc nofollow" target="_blank">KDD 2023</a> with two papers accepted into the main conference proceedings and 11 talks and presentations. In this blog post, we’ll summarize our team’s contributions and share highlights from an exciting week of workshops, panel discussions, and more.</p><h1 id="c699" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Deep learning and search ranking</h1><p id="b079" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">Even though search ranking is a problem that researchers have been working on for decades, there are still many nuances to explore. For example, at Airbnb, guests are typically searching over a period of days or weeks, not minutes. 
And being a two-way marketplace, there are factors like the potential for hosts to cancel the booking that we’d like to account for in ranking.</p><p id="7d18" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><a class="af om" href="https://arxiv.org/abs/2305.18431" rel="noopener ugc nofollow" target="_blank">Optimizing Airbnb Search Journey with Multi-task Learning</a>, our paper accepted at KDD 2023, presents Journey Ranker, a new multi-task deep learning model. The core insight here is that for this kind of long-term search task, we want to optimize for intermediate steps in the user journey.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb pq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*kLAJ2bVBzKCyzpmg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*kLAJ2bVBzKCyzpmg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*kLAJ2bVBzKCyzpmg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*kLAJ2bVBzKCyzpmg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*kLAJ2bVBzKCyzpmg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*kLAJ2bVBzKCyzpmg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*kLAJ2bVBzKCyzpmg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*kLAJ2bVBzKCyzpmg 640w, 
https://miro.medium.com/v2/resize:fit:720/0*kLAJ2bVBzKCyzpmg 720w, https://miro.medium.com/v2/resize:fit:750/0*kLAJ2bVBzKCyzpmg 750w, https://miro.medium.com/v2/resize:fit:786/0*kLAJ2bVBzKCyzpmg 786w, https://miro.medium.com/v2/resize:fit:828/0*kLAJ2bVBzKCyzpmg 828w, https://miro.medium.com/v2/resize:fit:1100/0*kLAJ2bVBzKCyzpmg 1100w, https://miro.medium.com/v2/resize:fit:1400/0*kLAJ2bVBzKCyzpmg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="d740" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><em class="pw">The Journey Ranker base module assists guests in reaching positive milestones. There is also a Twiddler module that assists guests in avoiding negative milestones. The modules work off a shared feature representation of listing and guest context, and their output scores are combined.</em></p><p id="fbe5" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Because of its modular design, Journey Ranker can be used whenever there are positive or negative milestones to consider. 
We’ve implemented it in several Airbnb search products, as well as other products, to drive improvements in business metrics.</p><p id="abff" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">We also co-presented <a class="af om" href="https://dcaitutorial.github.io/" rel="noopener ugc nofollow" target="_blank">a tutorial on Data-Centric AI</a> (DCAI). DCAI is a fast-growing field in deep learning, because as model design matures, innovation is being driven by data. We shared DCAI best practices and trends for developing training data, developing inference data, maintaining data, and creating benchmarks, with many examples from working with LLMs.</p><h1 id="6bb5" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Online experimentation and measurement</h1><p id="3517" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">Online experimentation (e.g., A/B testing) is a common way for organizations like Airbnb to make data-driven decisions. But high variance is frequently a challenge. 
For example, it’s hard to prove that a change in our search UX will drive value when bookings are infrequent and depend on a large number of interactions over a long period of time.</p><p id="d868" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Our paper <a class="af om" href="https://dl.acm.org/doi/pdf/10.1145/3580305.3599928" rel="noopener ugc nofollow" target="_blank">Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes</a> presents two new methods for variance reduction that rely exclusively on in-experiment data:</p><ol class=""><li id="ce3f" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol px py pz bj">A framework for a model-based leading indicator metric that continually estimates progress toward a delayed binary outcome.</li><li id="a132" class="no np gr nq b nr qa nt nu nv qb nx ny nz qc ob oc od qd of og oh qe oj ok ol px py pz bj">A counterfactual treatment exposure index that quantifies the amount a user is impacted by the treatment.</li></ol><p id="8026" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">In testing, both methods achieved a variance reduction of 50% or more. 
These techniques have greatly improved our experimentation efficiency and impact.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qf"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*VLJeyMFXLGQAMQ_h 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*VLJeyMFXLGQAMQ_h 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*VLJeyMFXLGQAMQ_h 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*VLJeyMFXLGQAMQ_h 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*VLJeyMFXLGQAMQ_h 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*VLJeyMFXLGQAMQ_h 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*VLJeyMFXLGQAMQ_h 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*VLJeyMFXLGQAMQ_h 640w, https://miro.medium.com/v2/resize:fit:720/0*VLJeyMFXLGQAMQ_h 720w, https://miro.medium.com/v2/resize:fit:750/0*VLJeyMFXLGQAMQ_h 750w, https://miro.medium.com/v2/resize:fit:786/0*VLJeyMFXLGQAMQ_h 786w, https://miro.medium.com/v2/resize:fit:828/0*VLJeyMFXLGQAMQ_h 828w, https://miro.medium.com/v2/resize:fit:1100/0*VLJeyMFXLGQAMQ_h 1100w, https://miro.medium.com/v2/resize:fit:1400/0*VLJeyMFXLGQAMQ_h 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="c11e" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><em class="pw">With more than 50% variance reduction, the new model-based leading indicator metric (listing-view utility, on the right) aligns with the target uncancelled booking metric much better than other indicators such as listing-view with dates (on the left).</em></p><p id="3033" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Another interesting challenge in online experimentation is avoiding interference bias, which can happen when you have competition between your A/B test subjects. Airbnb presented a keynote talk on this topic at KDD’s <a class="af om" href="https://sites.google.com/view/kdd23onlinemarketplaces/home" rel="noopener ugc nofollow" target="_blank">2nd Workshop on Decision Intelligence and Analytics for Online Marketplaces</a>. As an example, if you ran an A/B test where group B saw lower booking prices, they might “cannibalize” the bookings from group A. There are two imperfect solutions: clustering (isolating the options for participants) and switchbacks (grouping participants by time intervals).</p><p id="711a" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Also at the workshop, we presented the paper <a class="af om" href="https://airbnb.tech/wp-content/uploads/sites/19/2023/12/CIKM.pdf" rel="noopener ugc nofollow" target="_blank">The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods</a>. 
The paper discusses lead-day bias, which arises when items like concert tickets, air travel, and Airbnb bookings vary in price based on how far they are from their expiration date. This can wreak havoc on A/B tests, and in the paper we present several mitigation techniques, such as limited rollout, smart overlapping of experiments, and a Heterogeneous Treatment Effect (HTE) remixed estimator, to correct for bias and accelerate the R&amp;D process.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qg"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*_BreW6LQyQTzs2XZ 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*_BreW6LQyQTzs2XZ 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*_BreW6LQyQTzs2XZ 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*_BreW6LQyQTzs2XZ 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*_BreW6LQyQTzs2XZ 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*_BreW6LQyQTzs2XZ 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*_BreW6LQyQTzs2XZ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*_BreW6LQyQTzs2XZ 640w, https://miro.medium.com/v2/resize:fit:720/0*_BreW6LQyQTzs2XZ 720w, https://miro.medium.com/v2/resize:fit:750/0*_BreW6LQyQTzs2XZ 750w, https://miro.medium.com/v2/resize:fit:786/0*_BreW6LQyQTzs2XZ 786w, 
https://miro.medium.com/v2/resize:fit:828/0*_BreW6LQyQTzs2XZ 828w, https://miro.medium.com/v2/resize:fit:1100/0*_BreW6LQyQTzs2XZ 1100w, https://miro.medium.com/v2/resize:fit:1400/0*_BreW6LQyQTzs2XZ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="ac91" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><em class="pw">Along with limited rollout and smart overlapping of experiments, the HTE-remixed estimator can provide a sufficiently robust estimate of the long-term experiment impact from the short-term result and significantly shorten the experiment run time.</em></p><h1 id="6ee1" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Causal inference for marketing and user journey optimization</h1><p id="2052" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">In marketing, the million-dollar question is: how much should you spend per channel? This can be reframed as a causal inference problem: how many incremental conversions does each channel drive?</p><p id="b7f8" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">When we look at marketing activities across Nielsen’s Designated Marketing Areas (DMAs), we find moderate to strong correlation across channels. This makes it hard to isolate the impact of one channel from another. 
In fact, when we include the correlated channels in the same regression, the coefficients flip signs for most channels, a clear sign of multicollinearity.</p><p id="90c9" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Existing solutions to multicollinearity, such as shrinkage estimators, principal component analysis, and partial linear regression, are particularly helpful for prediction problems but work less well for our use case, where we need to maintain business interpretability while isolating causality. Our approach, described in the paper <a class="af om" href="https://airbnb.tech/wp-content/uploads/sites/19/2023/12/31.KDD-Paper-Hierarchical-Clustering-As-a-Solution-to-Multicollinearity-%E2%80%93-Marketing-Application-as-an-Example.pdf" rel="noopener ugc nofollow" target="_blank">Hierarchical Clustering as a Novel Solution to Multicollinearity</a>, is to hierarchically cluster DMAs based on their similarity in marketing impressions over time. 
With such clustering, cross-channel correlation dropped by up to 43% and the channel coefficients no longer flip signs.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*xCf2MRTm27kOyaEj 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*xCf2MRTm27kOyaEj 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*xCf2MRTm27kOyaEj 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*xCf2MRTm27kOyaEj 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*xCf2MRTm27kOyaEj 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*xCf2MRTm27kOyaEj 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*xCf2MRTm27kOyaEj 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*xCf2MRTm27kOyaEj 640w, https://miro.medium.com/v2/resize:fit:720/0*xCf2MRTm27kOyaEj 720w, https://miro.medium.com/v2/resize:fit:750/0*xCf2MRTm27kOyaEj 750w, https://miro.medium.com/v2/resize:fit:786/0*xCf2MRTm27kOyaEj 786w, https://miro.medium.com/v2/resize:fit:828/0*xCf2MRTm27kOyaEj 828w, https://miro.medium.com/v2/resize:fit:1100/0*xCf2MRTm27kOyaEj 1100w, https://miro.medium.com/v2/resize:fit:1400/0*xCf2MRTm27kOyaEj 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, 
(min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="70fe" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Not only does our method provide an intuitive and effective solution to multicollinearity, it also circumvents the need for complex transformation and preserves the interpretability of the data and the results throughout, empowering broad applications to causal inference problems.</p><p id="0302" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">We presented this paper at the new KDD workshop, <a class="af om" href="https://causal-machine-learning.github.io/kdd2023-workshop/" rel="noopener ugc nofollow" target="_blank">Causal Inference and Machine Learning in Practice: Use cases for Product, Brand, Policy, and beyond</a>. Airbnb’s Totte Harinen co-organized this workshop, which strongly resonated with KDD’s audience — it had 12 papers and four invited talks from 37 authors in 14 institutions.</p><p id="0a51" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">In addition, we were invited to present two talks and one poster at KDD’s <a class="af om" href="https://sites.google.com/view/kdd-workshop-2023" rel="noopener ugc nofollow" target="_blank">2nd Workshop on End-End Customer Journey Optimization</a>, and joined the workshop’s panel discussion. One of these talks covered CLV (customer lifetime value) modeling. At Airbnb, we want to grow our brand and community by growing all users. 
Our CLV ecosystem applies two frameworks:</p><ol class=""><li id="f688" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol px py pz bj">The value of Airbnb customers. We use traditional ML approaches along with research into more customer-lifecycle-focused architectures (e.g., HMMs). We augment this with demand-supply incrementality modeling to properly account for guest and host contributions to value.</li><li id="c4d6" class="no np gr nq b nr qa nt nu nv qb nx ny nz qc ob oc od qd of og oh qe oj ok ol px py pz bj">The value growth that Airbnb delivers to customers. By accounting for long-term incremental effects of booking on Airbnb along with incremental contributions from marketing and attribution strategies, we can measure incremental changes in CLV and optimize towards them.</li></ol><p id="bfe8" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Causal inference can also be applied to search. At the CJ workshop, we presented our paper <a class="af om" href="https://airbnb.tech/wp-content/uploads/sites/19/2023/12/CameraReady-7-1.pdf" rel="noopener ugc nofollow" target="_blank">Low Inventory State: Identifying Under-Served Queries for Airbnb Search</a>, which explored the problem of searches that return a low number of results. Whether or not that number is “too low” and will deter a guest from booking depends on search parameters and intent to book. For a given search query, we can use causal inference to determine the incremental effect of an additional result on the probability of booking. Our model outperforms non-causal methods and can assist with supply management as well.</p><p id="1adc" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Finally, our poster discussed how we measure the effects of national TV advertising campaigns. 
We analyzed TV exposure data and demographic data with data on Airbnb onsite behavior using a third-party identity graph. We were able to resolve disparate datasets to a unique identifier and model individual households.</p><p id="fbf0" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">We use propensity score matching to estimate TV effects, and then scale these estimates to a nationally-representative population. We leverage this data to provide tactical insights for marketing and understand how long TV effects take to decay.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div class="na nb qi"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*2550R2exF5NwZquO26vY9A.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*2550R2exF5NwZquO26vY9A.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*2550R2exF5NwZquO26vY9A.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*2550R2exF5NwZquO26vY9A.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*2550R2exF5NwZquO26vY9A.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*2550R2exF5NwZquO26vY9A.png 1100w, https://miro.medium.com/v2/resize:fit:1046/format:webp/1*2550R2exF5NwZquO26vY9A.png 1046w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 523px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*2550R2exF5NwZquO26vY9A.png 640w, 
https://miro.medium.com/v2/resize:fit:720/1*2550R2exF5NwZquO26vY9A.png 720w, https://miro.medium.com/v2/resize:fit:750/1*2550R2exF5NwZquO26vY9A.png 750w, https://miro.medium.com/v2/resize:fit:786/1*2550R2exF5NwZquO26vY9A.png 786w, https://miro.medium.com/v2/resize:fit:828/1*2550R2exF5NwZquO26vY9A.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*2550R2exF5NwZquO26vY9A.png 1100w, https://miro.medium.com/v2/resize:fit:1046/1*2550R2exF5NwZquO26vY9A.png 1046w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 523px" /></picture></div></figure><p id="3511" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">The plot above (from simulated study for illustration) shows the results of an analysis for a TV campaign from August — October. We can see that the TV campaign was effective at increasing bookings for households that saw an Airbnb TV ad and was more effective for one subgroup (red line) than the other subgroup.</p><h1 id="1bd1" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Data science and analytics infra</h1><p id="029b" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">How can you achieve science at scale in a medium-to-large engineering organization? 
At the KDD’s <a class="af om" href="https://wamlm-kdd.github.io/wamlm/index.html" rel="noopener ugc nofollow" target="_blank">2nd Workshop on Applied Machine Learning Management</a>, we shared Airbnb’s solution for data science reproducibility and reuse, <a class="af om" href="https://wamlm-kdd.github.io/wamlm/papers/wamlm-kdd23_paper_Daniel_Miller.pdf" rel="noopener ugc nofollow" target="_blank">Onebrain</a>. The core of Onebrain is a coding standard for configuring data science projects entirely in YAML. Onebrain’s backend abstracts away CI/CD, configuration/dependency management, and command-line parsing. Since it’s “just code,” Onebrain projects can be checked into a version-controlled repo, and any repo can be a Onebrain repo.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Y0TXGcnp4lz40NVn 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Y0TXGcnp4lz40NVn 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Y0TXGcnp4lz40NVn 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Y0TXGcnp4lz40NVn 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Y0TXGcnp4lz40NVn 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Y0TXGcnp4lz40NVn 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Y0TXGcnp4lz40NVn 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/0*Y0TXGcnp4lz40NVn 640w, https://miro.medium.com/v2/resize:fit:720/0*Y0TXGcnp4lz40NVn 720w, https://miro.medium.com/v2/resize:fit:750/0*Y0TXGcnp4lz40NVn 750w, https://miro.medium.com/v2/resize:fit:786/0*Y0TXGcnp4lz40NVn 786w, https://miro.medium.com/v2/resize:fit:828/0*Y0TXGcnp4lz40NVn 828w, https://miro.medium.com/v2/resize:fit:1100/0*Y0TXGcnp4lz40NVn 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Y0TXGcnp4lz40NVn 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="f0e9" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">User interaction with Onebrain happens through a CLI. With a single command, anyone can use an existing project as a template for their own work, or generate a one-click URL to spin up a server and run the project. Usage is growing fast with over 200 distinct projects and over 500 users at Airbnb within just a year.</p><p id="8292" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">While most of our research focuses on high-order data use-cases like models, data capture is essential as it’s the starting point for any analysis. Event logging libraries typically capture <em class="pw">actions on</em> and <em class="pw">impressions of</em> app components (buttons, sections, pages). 
But with this level of granularity, it can be difficult to abstract out user behavior, measure the total time spent on a surface, or understand the context surrounding an action.</p><p id="d0e2" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">At the <a class="af om" href="https://sites.google.com/view/kdd-workshop-2023" rel="noopener ugc nofollow" target="_blank">2nd Workshop on End-End Customer Journey Optimization</a>, we spoke about a new type of client-side event called Sessions. Part of Airbnb’s client-side logging solution, Sessions provide a way to track user context and behaviors within the Airbnb product. Unlike traditional time-based sessions used in web analytics, these Sessions can be tied to various aspects of the Airbnb user experience. For example, they can be tied to specific surfaces like the checkout page, API calls used for observability, or even internal states of the app that abstract away complex UI components. 
The flexibility of Sessions allows us to capture a wide range of user interactions and better understand their journey throughout our platform.</p><figure class="pr ps pt pu pv ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*W0F1AhzPituZ1bfF 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*W0F1AhzPituZ1bfF 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*W0F1AhzPituZ1bfF 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*W0F1AhzPituZ1bfF 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*W0F1AhzPituZ1bfF 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*W0F1AhzPituZ1bfF 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*W0F1AhzPituZ1bfF 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*W0F1AhzPituZ1bfF 640w, https://miro.medium.com/v2/resize:fit:720/0*W0F1AhzPituZ1bfF 720w, https://miro.medium.com/v2/resize:fit:750/0*W0F1AhzPituZ1bfF 750w, https://miro.medium.com/v2/resize:fit:786/0*W0F1AhzPituZ1bfF 786w, https://miro.medium.com/v2/resize:fit:828/0*W0F1AhzPituZ1bfF 828w, https://miro.medium.com/v2/resize:fit:1100/0*W0F1AhzPituZ1bfF 1100w, https://miro.medium.com/v2/resize:fit:1400/0*W0F1AhzPituZ1bfF 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 
700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><h1 id="0228" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Conclusion</h1><p id="6d23" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">KDD is an amazing opportunity for data scientists from around the world, and across industry and academia, to come together and exchange learnings and discoveries. We were honored to be invited to share techniques we’ve developed through applied research at Airbnb. The strategies and insights we presented at KDD have been essential to improving Airbnb’s platform, business, and user experience. We’re constantly motivated by innovations happening around us, and we’re thrilled to give back to the community and eager to see what kinds of new applications and advancements may come about as a result.</p><p id="80a2" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">At the bottom of the page, you’ll find a complete list of the talks and papers shared in this article along with the team members who contributed. 
If you can see yourself on our team, we encourage you to apply for an <a class="af om" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open position</a> today.</p><h1 id="ad3f" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">List of papers and talks</h1><p id="fb76" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj"><strong class="nq gs">Optimizing Airbnb Search Journey with Multi-task Learning [</strong><a class="af om" href="https://arxiv.org/abs/2305.18431" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">link</strong></a><strong class="nq gs">]</strong></p><p id="3752" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Authors: <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Tan%2C+C+H" rel="noopener ugc nofollow" target="_blank">Chun How Tan</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Chan%2C+A" rel="noopener ugc nofollow" target="_blank">Austin Chan</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Haldar%2C+M" rel="noopener ugc nofollow" target="_blank">Malay Haldar</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Tang%2C+J" rel="noopener ugc nofollow" target="_blank">Jie Tang</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Liu%2C+X" rel="noopener ugc nofollow" target="_blank">Xin Liu</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Abdool%2C+M" rel="noopener ugc nofollow" target="_blank">Mustafa Abdool</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Gao%2C+H" rel="noopener ugc nofollow" target="_blank">Huiji Gao</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=He%2C+L" 
rel="noopener ugc nofollow" target="_blank">Liwei He</a>, <a class="af om" href="https://arxiv.org/search/cs?searchtype=author&amp;query=Katariya%2C+S" rel="noopener ugc nofollow" target="_blank">Sanjeev Katariya</a></p><p id="ff81" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes [</strong><a class="af om" href="https://alexdeng.github.io/public/files/kdd2023-inexp.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">link</strong></a><strong class="nq gs">]</strong></p><p id="00b0" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Authors: Alex Deng, Michelle Du, Anna Matlin, Qing Zhang</p><p id="f876" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Beyond the Simple A/B test: Mitigating Interference Bias at Airbnb</strong></p><p id="0195" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Speaker: Ruben Lobel</p><p id="9b07" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods [</strong><a class="af om" href="https://alexdeng.github.io/public/files/Smart_Pricing_CIKM.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">link</strong></a><strong class="nq gs">]</strong></p><p id="8b45" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Authors: Thu Le, Alex Deng</p><p id="2a12" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong 
class="nq gs">Unveiling the Guest &amp; Host Journey: Session-Based Instrumentation on Airbnb Platform</strong></p><p id="b5ec" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Speaker: Shant Torosean</p><p id="a26c" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Devoted to Long-Term Adventure: Growing Airbnb Through Measuring Customer Lifetime Value</strong></p><p id="927c" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Speakers: Sean O’Donell, Jason Cai, Linsha Chen</p><p id="9e0c" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Low Inventory State: Identifying Under-Served Queries for Airbnb Search [</strong><a class="af om" href="https://airbnb.tech/wp-content/uploads/sites/19/2023/12/CameraReady-7-1.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">link</strong></a><strong class="nq gs">]</strong></p><p id="20a1" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Authors: Toma Gulea, Bradley Turnbull</p><p id="1bf0" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Measuring TV Campaigns at Airbnb</strong></p><p id="40c1" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Speakers: Adam Maidman, Sam Barrows</p><p id="025b" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Tutorial: Data-Centric AI [</strong><a class="af om" href="https://dcaitutorial.github.io/index.html" rel="noopener ugc nofollow" target="_blank"><strong class="nq 
gs">link</strong></a><strong class="nq gs">]</strong></p><p id="33c9" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Presenters: Daochen Zha, Huiji Gao</p><p id="07a7" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">Hierarchical Clustering As a Novel Solution to the Notorious Multicollinearity Problem in Observational Causal Inference [</strong><a class="af om" href="https://airbnb.tech/wp-content/uploads/sites/19/2023/12/31.KDD-Paper-Hierarchical-Clustering-As-a-Solution-to-Multicollinearity-%E2%80%93-Marketing-Application-as-an-Example.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">link</strong></a><strong class="nq gs">]</strong></p><p id="fb6d" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Authors: Yufei Wu, Zhiying Gu, Alex Deng, Jacob Zhu, Linsha Chen</p><p id="b8b6" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><a class="af om" href="https://wamlm-kdd.github.io/wamlm/papers/wamlm-kdd23_paper_Daniel_Miller.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">Onebrain — Microprojects for Data Science</strong></a><strong class="nq gs"> [</strong><a class="af om" href="https://wamlm-kdd.github.io/wamlm/papers/wamlm-kdd23_paper_Daniel_Miller.pdf" rel="noopener ugc nofollow" target="_blank"><strong class="nq gs">link</strong></a><strong class="nq gs">]</strong></p><p id="13d1" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Authors: Daniel Miller, Alex Deng, Narek Amirbekian, Navin Sivanandam, Rodolfo Carboni</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnb-at-kdd-2023-9084ad244d8c</link>
      <guid>https://medium.com/airbnb-engineering/airbnb-at-kdd-2023-9084ad244d8c</guid>
      <pubDate>Fri, 22 Dec 2023 20:59:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Transforming CRM DevOps at Airbnb: A Powerful Framework for Continuous Delivery]]></title>
      <description><![CDATA[<div><div class="hs ht hu hv hw"></div><p id="0597" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">How we championed the CRM CI/CD framework integrating Salesforce DX, GIT, BUILDKITE and Vlocity for an enhanced, efficient and continuous delivery with high software quality.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz oa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*P4XeFIHvCDewP429 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*P4XeFIHvCDewP429 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*P4XeFIHvCDewP429 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*P4XeFIHvCDewP429 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*P4XeFIHvCDewP429 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*P4XeFIHvCDewP429 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*P4XeFIHvCDewP429 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*P4XeFIHvCDewP429 640w, https://miro.medium.com/v2/resize:fit:720/0*P4XeFIHvCDewP429 720w, https://miro.medium.com/v2/resize:fit:750/0*P4XeFIHvCDewP429 750w, https://miro.medium.com/v2/resize:fit:786/0*P4XeFIHvCDewP429 786w, https://miro.medium.com/v2/resize:fit:828/0*P4XeFIHvCDewP429 828w, 
https://miro.medium.com/v2/resize:fit:1100/0*P4XeFIHvCDewP429 1100w, https://miro.medium.com/v2/resize:fit:1400/0*P4XeFIHvCDewP429 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="fa8e" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">By:</strong> <a class="af om" href="https://www.linkedin.com/in/shardakumari/" rel="noopener ugc nofollow" target="_blank">Sharda Kumari</a>, <a class="af om" href="https://www.linkedin.com/in/pramodgavade/" rel="noopener ugc nofollow" target="_blank">Pramod Gavade</a></p><h1 id="d796" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Introduction</h1><p id="1693" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">The CRM platform offers a robust suite of functionalities for building scalable applications with minimal reliance on complex coding. However, managing and deploying code and configurations within this ecosystem can be challenging, and the constantly evolving nature of the platform presents an extra layer of complexity. This can lead to slow deployment times, difficulty in balancing code and configuration (e.g. Apex classes and triggers vs. 
validation rules, page layouts), and managing multiple environments, among other issues.</p><p id="9584" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">To address these challenges, at Airbnb we have developed a resilient DevOps framework specifically tailored to the CRM platform. The framework automates the process of moving code and configuration into production for developers and system administrators, as well as low-code users such as business analysts building dashboards.</p><p id="73e0" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">The Challenges</strong></p><p id="2bfa" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The CRM Platform is a versatile ecosystem with powerful functionalities, but managing code and configurations within it can be challenging. With complex metadata and multiple environments, deploying changes efficiently is difficult. Additionally, the evolving nature of the platform requires a proactive approach.</p><p id="eb28" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">Deployment times</strong>: Deploying changes can be a time-consuming process, especially for extensive applications. This can slow delivery, causing dissatisfaction among developers and stakeholders alike.</p><p id="54f9" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">Balancing code and configuration</strong>: The CRM platform permits the development of applications via both programmatic code (using Apex, Visualforce and Lightning Web Components) and low-code development methodologies (using tools like App Builder and Flow Builder). 
However, effectively managing a combination of code-based and configuration-based development poses a formidable challenge.</p><p id="b047" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">Managing multiple environments</strong>: CRM developers typically work in individual environments (sandbox or scratch orgs) to build and test their code prior to moving it to higher-level environments. However, administering multiple environments across different teams can become increasingly intricate and time-consuming.</p><p id="c05f" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">Complexity of file metadata</strong>: CRM is an intricate platform with different types of metadata (including, but not limited to, Apex classes, triggers, Lightning components, flows) that require meticulous management during the development and deployment process.</p><p id="0588" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">Keeping up with changes</strong>: As a cloud-based platform, CRM undergoes frequent changes, with new features and updates being released regularly. Keeping up with these changes and integrating them with existing applications without disruption can be a significant challenge.</p><p id="7040" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">The Solution</strong></p><p id="2cec" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">At Airbnb, we have developed a resilient DevOps framework tailored specifically to the CRM platform that integrates Salesforce DX, Git, and Buildkite. 
Our approach brings all stakeholders, from developers to system administrators and low-code users, into the development and deployment process, thereby optimizing the DevOps solution for all personas involved.</p><p id="c8ff" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The CRM DevOps lifecycle comprises the following critical environments that are necessary for the effective deployment of code:</p><ul class=""><li id="f00a" class="na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx pq pr ps bj">Developer (DEV): Developers use the DEV environment to build and refine the code.</li><li id="b1ba" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Integration (SIT): SIT ensures seamless integration with other systems.</li><li id="a6d8" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Quality Assurance (QA): QA verifies the software’s functionality.</li><li id="af36" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Full copy (STAGING): Staging provides a realistic setting for training and user acceptance.</li><li id="058a" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Pre-release: Pre-release serves as a controlled hosting platform before the code goes live.</li><li id="e1d8" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Hotfix: Hotfix enables swift resolution of urgent production issues.</li><li id="8060" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Prod: The production instance that houses all live traffic and data.</li></ul><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny 
nz py"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*3EVe7llwjhhGkiXa 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*3EVe7llwjhhGkiXa 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*3EVe7llwjhhGkiXa 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*3EVe7llwjhhGkiXa 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*3EVe7llwjhhGkiXa 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*3EVe7llwjhhGkiXa 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*3EVe7llwjhhGkiXa 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*3EVe7llwjhhGkiXa 640w, https://miro.medium.com/v2/resize:fit:720/0*3EVe7llwjhhGkiXa 720w, https://miro.medium.com/v2/resize:fit:750/0*3EVe7llwjhhGkiXa 750w, https://miro.medium.com/v2/resize:fit:786/0*3EVe7llwjhhGkiXa 786w, https://miro.medium.com/v2/resize:fit:828/0*3EVe7llwjhhGkiXa 828w, https://miro.medium.com/v2/resize:fit:1100/0*3EVe7llwjhhGkiXa 1100w, https://miro.medium.com/v2/resize:fit:1400/0*3EVe7llwjhhGkiXa 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and 
(max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="1672" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">Each of these environments is associated with a corresponding branch in the Git version control system, and the environments are connected via a Buildkite DevOps pipeline. Once a change passes peer review by developers, Buildkite jobs are triggered that use Salesforce DX to deploy the code into the target sandbox.</p><p id="4c46" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The DevOps journey begins with refreshing individual developer instances within the CRM from the integration (SIT) environment to import changes made by other developers. Following this, a feature branch is created in Git from the integration branch for development and unit testing in individual environments. Developers can work efficiently in their own development environments while staying in sync with changes from the rest of the team.</p><p id="ae76" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">Once new functionality has been implemented, the development phase is complete. Code then undergoes a series of rigorous quality assurance procedures before moving on to the SIT environment. One of these procedures is static code analysis, which checks adherence to coding standards and best practices. Additionally, pull requests are subjected to extensive review and approval procedures to maintain code quality. 
Finally, before being promoted, the code is deployed to the SIT environment for integration testing, where system integrations are validated.</p><p id="6de9" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">After successful testing, changes are promoted to the QA environment for functional and regression testing, which includes automation testing using Provar. The QA environment is used to test a feature end to end across the integrating systems. Automation testing scripts are run in this environment to ensure a bug-free product. While code reviews make sure high coding standards are in place, QA and automation testing ensure a well-functioning solution.</p><p id="3245" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The next step in the DevOps journey is the staging environment, where user acceptance testing and performance testing take place. A PTPaaS sandbox, a pilot offering designed specifically for performance testing, is used for performance and load testing. This sandbox is connected to the staging branch to ensure seamless execution of performance testing. With the final steps of validation complete, changes that originated in someone’s feature branch can now be promoted to production and used by everyone.</p><p id="4ec9" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">In contrast to normal feature development, we need to be able to push urgent fixes. This flow takes a different route. Fixes originate in a dedicated environment (called Hotfix) that is a replica of Prod. A fix is worked on in the hotfix environment, tested and validated by QA, and then pushed to Prod via Staging. Hotfixes circumvent the SIT environment. 
Once promoted to production, fixes are then back-propagated to the lowest developer sandbox in the chain.</p><p id="116b" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">A release time block (deployment window) for the code push is agreed upon by all stakeholders. Having a dedicated time for deployment helps set expectations with end users about the new features being released. One of the most significant challenges of CRM deployments is prolonged deployment times, which are further compounded by sizable implementations and longer test executions. Changes can sometimes take up to 90 minutes to deploy. To address this challenge, our framework takes a proactive approach by performing a build validation against the Production environment. This validation runs all test classes and validates metadata in advance, typically more than 24 hours before the deployment window, and enables Quick Deploy in the target org. Validating in advance helps detect issues with the build or test runs early on. During the deployment window, our DevOps automation then executes a quick deploy from Buildkite, which significantly reduces the actual deployment time. To further minimize deployment times, we have implemented Incremental Deploys, which deploy only the differences between codebases instead of the entire codebase. 
The previous commit ID is stored in a custom setting in the target org, and Buildkite retrieves it during deployment so that only the changes made since that commit are deployed.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz pz"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*NlvazwPpDt7rxO12 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*NlvazwPpDt7rxO12 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*NlvazwPpDt7rxO12 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*NlvazwPpDt7rxO12 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*NlvazwPpDt7rxO12 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*NlvazwPpDt7rxO12 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*NlvazwPpDt7rxO12 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*NlvazwPpDt7rxO12 640w, https://miro.medium.com/v2/resize:fit:720/0*NlvazwPpDt7rxO12 720w, https://miro.medium.com/v2/resize:fit:750/0*NlvazwPpDt7rxO12 750w, https://miro.medium.com/v2/resize:fit:786/0*NlvazwPpDt7rxO12 786w, https://miro.medium.com/v2/resize:fit:828/0*NlvazwPpDt7rxO12 828w, https://miro.medium.com/v2/resize:fit:1100/0*NlvazwPpDt7rxO12 1100w, https://miro.medium.com/v2/resize:fit:1400/0*NlvazwPpDt7rxO12 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 
700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="c5cc" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">The Wins</strong></p><ul class=""><li id="9568" class="na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx pq pr ps bj">Using Quick Deploy reduced the deployment time from an average of 90 minutes to 15 minutes.</li><li id="f427" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Enabled delta deploys instead of full-package deploys.</li><li id="3773" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Config and metadata versioning.</li><li id="c44c" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Built-in static code analysis in the repository.</li><li id="f7bd" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Rollback mechanism.</li><li id="ff5b" class="na nb gr nc b nd pt nf ng nh pu nj nk nl pv nn no np pw nr ns nt px nv nw nx pq pr ps bj">Integration test hooks.</li></ul><h1 id="9701" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Conclusion</h1><p id="8eaa" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">Our DevOps implementation has successfully enabled efficient, continuous delivery of high-quality software on the CRM platform. 
This was achieved by integrating Salesforce DX, Git, and Buildkite into a DevOps framework that is optimized for all personas, including developers, admins, and low-code users. As a result, we have seen average deployment time drop from 90 minutes to 15 minutes with Quick Deploy, along with an improvement in software quality, ultimately leading to the delivery of greater value to our clients.</p><p id="e7df" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><em class="qa">If this type of work interests you, check out some of our </em><a class="af om" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="qa">related positions.</em></a></p><h1 id="fac0" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Acknowledgments</h1><p id="536d" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">Special mention to <a class="af om" href="mailto:yiming.yan@airbnb.com" rel="noopener ugc nofollow" target="_blank">Yiming Yan</a>, <a class="af om" href="mailto:sameer.miraj@airbnb.com" rel="noopener ugc nofollow" target="_blank">Sameer Miraj</a>, <a class="af om" href="mailto:vasu.rampally@airbnb.com" rel="noopener ugc nofollow" target="_blank">Vasu Rampally</a>, <a class="af om" href="mailto:srinath.therampattil@airbnb.com" rel="noopener ugc nofollow" target="_blank">Srinath Therampattil</a>, <a class="af om" href="mailto:leah.kennedy@airbnb.com" rel="noopener ugc nofollow" target="_blank">Leah Kennedy</a>, and the entire team for helping with onboarding onto this framework. Special thanks to <a class="af om" href="mailto:sudheer.peddineni@airbnb.com" rel="noopener ugc nofollow" target="_blank">Sudheer Peddineni</a> for the guidance.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/transforming-crm-devops-at-airbnb-a-powerful-framework-for-continuous-delivery-c84a9c18032e</link>
      <guid>https://medium.com/airbnb-engineering/transforming-crm-devops-at-airbnb-a-powerful-framework-for-continuous-delivery-c84a9c18032e</guid>
      <pubDate>Wed, 29 Nov 2023 18:01:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Data Quality Score: The next chapter of data quality at Airbnb]]></title>
      <description><![CDATA[<div><div class="hs ht hu hv hw"></div><figure class="nd ne nf ng nh ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb nc"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*yg9o8VZ22wW_NRiZ 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*yg9o8VZ22wW_NRiZ 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*yg9o8VZ22wW_NRiZ 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*yg9o8VZ22wW_NRiZ 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*yg9o8VZ22wW_NRiZ 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*yg9o8VZ22wW_NRiZ 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*yg9o8VZ22wW_NRiZ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*yg9o8VZ22wW_NRiZ 640w, https://miro.medium.com/v2/resize:fit:720/0*yg9o8VZ22wW_NRiZ 720w, https://miro.medium.com/v2/resize:fit:750/0*yg9o8VZ22wW_NRiZ 750w, https://miro.medium.com/v2/resize:fit:786/0*yg9o8VZ22wW_NRiZ 786w, https://miro.medium.com/v2/resize:fit:828/0*yg9o8VZ22wW_NRiZ 828w, https://miro.medium.com/v2/resize:fit:1100/0*yg9o8VZ22wW_NRiZ 1100w, https://miro.medium.com/v2/resize:fit:1400/0*yg9o8VZ22wW_NRiZ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="72c4" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">By:</strong> <a class="af om" href="https://www.linkedin.com/in/clark-wright/" rel="noopener ugc nofollow" target="_blank">Clark Wright</a></p><h1 id="343c" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Introduction</h1><p id="5d9f" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">These days, as the volume of data collected by companies grows exponentially, we’re all realizing that more data is not always better. In fact, more data, especially if you can’t rely on its quality, can hinder a company by slowing down decision-making or causing poor decisions.</p><p id="c2cc" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">With 1.4 billion cumulative guest arrivals as of year-end 2022, Airbnb’s growth pushed us to an inflection point where diminishing data quality <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/data-quality-at-airbnb-e582465f3ef7">began to hinder our data practitioners</a>. Weekly metric reports were difficult to land on time. Seemingly basic metrics like “Active Listings” relied on a web of upstream dependencies. 
Conducting meaningful data work required significant institutional knowledge to overcome hidden caveats in our data.</p><p id="53ef" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">To meet this challenge, we <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/data-quality-at-airbnb-870d03080469">introduced the “Midas” process to certify our data</a>. Starting in 2020, the Midas process, along with the work to re-architect our most critical data models, has brought a dramatic increase in data quality and timeliness to Airbnb’s most critical data. However, achieving the full data quality criteria required by Midas demands significant cross-functional investment to design, develop, validate, and maintain the necessary data assets and documentation.</p><p id="98b2" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">While this made sense for our most critical data, pursuing such rigorous standards at scale presented challenges. We were approaching a point of diminishing returns on our data quality investments. We had certified our most critical assets, restoring their trustworthiness. However, for all of our uncertified data, which remained the majority of our offline data, we lacked visibility into its quality and didn’t have clear mechanisms for up-leveling it.</p><p id="9bf2" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">How could we scale the hard-fought wins and best practices of Midas across our entire data warehouse?</p><p id="63bc" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">In this blog post, we share our innovative approach to scoring data quality, Airbnb’s Data Quality Score (“DQ Score”). 
We’ll cover how we developed the DQ Score, how it’s being used today, and how it will power the next chapter of data quality at Airbnb.</p><h1 id="b2ce" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Scaling Data Quality</h1><p id="f7e3" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">In 2022, we began exploring ideas for scaling data quality beyond Midas certification. Data producers were requesting a lighter-weight process that could provide some of the quality guardrails of Midas, but with less rigor and time investment. Meanwhile, data consumers continued to fly blind on all data that wasn’t Midas-certified. The brand around Midas-certified data was so strong that consumers started to question whether they should trust any uncertified data. Hesitant to dilute the Midas branding, we wanted to avoid introducing a lightweight version of certification that further stratified our data without truly unlocking long-term scalability.</p><p id="d1ed" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Considering these challenges, we decided to shift to a data quality strategy that pushed the incentives around data quality directly to data producers and consumers. 
We made the decision that we could <strong class="nq gs">no longer rely on enforcement</strong> to scale data quality at Airbnb, and we instead needed to <strong class="nq gs">rely on incentivization</strong> of both the data producer and consumer.</p><p id="fce5" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">To fully enable this incentivization approach, we believed it would be <strong class="nq gs">paramount to introduce the concept of a data quality score</strong> directly tied to data assets.</p><p id="8e1d" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">We identified the following objectives for the score:</p><ul class=""><li id="83b5" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol pq pr ps bj">Evolve our understanding of data quality beyond a simple binary definition (certified vs uncertified).</li><li id="db3c" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Align on the input components for assessing data quality.</li><li id="881f" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Enable full visibility into the quality of our offline data warehouse and individual data assets. 
This visibility should 1) Create natural incentives for producers to improve the quality of the data they own, and 2) Drive demand for high-quality data from data consumers and enable consumers to decide if the quality is appropriate for their needs.</li></ul><h1 id="c942" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Composing the Score</h1><p id="8410" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">Before diving into the nuances of measuring data quality, we drove alignment on the vision by defining our DQ Score guiding principles. With the input of a cross-functional group of data practitioners, we aligned on these guiding principles:</p><ul class=""><li id="9455" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol pq pr ps bj"><strong class="nq gs">Full coverage </strong>— score can be applied to any in-scope data warehouse data asset</li><li id="1baf" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj"><strong class="nq gs">Automated </strong>— collection of inputs that determine the score is 100% automated</li><li id="6afe" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj"><strong class="nq gs">Actionable</strong> — score is easy to discover and actionable for both producers and consumers</li><li id="3975" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj"><strong class="nq gs">Multi-dimensional</strong> — score can be decomposed into pillars of data quality</li><li id="805e" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj"><strong class="nq gs">Evolvable </strong>— scoring criteria and their definitions can change over time</li></ul><p id="dbc1" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og 
oh oi oj ok ol gk bj">While they may seem simple or obvious, establishing these principles was critical as they guided each decision made in developing the score. Questions that otherwise would have derailed progress were mapped back to our principles.</p><p id="8781" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">For example, our principles were critical in determining which items from our wishlist of scoring criteria should be considered. There were several inputs that certainly could help us measure quality, but if they could not be automatically measured (<strong class="nq gs">Automated</strong>), or if they were so convoluted that data practitioners wouldn’t understand what the criterion meant or how it could be improved upon (<strong class="nq gs">Actionable</strong>), then they were discarded.</p><p id="f068" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">We also had a set of input signals that more directly measure quality (Midas certification, data validation, bugs, SLAs, automated DQ checks, etc.), whereas others were more like proxies for quality (e.g., valid ownership, good governance hygiene, the use of paved path tooling). Were the more explicit and direct measurements of quality more valuable than the proxies?</p><p id="79c3" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Guided by our principles, we eventually settled on having four dimensions of data quality: <strong class="nq gs">Accuracy, Reliability (Timeliness), Stewardship, and Usability</strong>. 
There were several other possible dimensions that we considered, but these four were the most meaningful and useful to our data practitioners, and made sense as axes of improvement: dimensions along which we care about, and are willing to invest in, improving our data.</p><p id="1c9b" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Each dimension could mix implicit and explicit quality indicators, with the key being: Not every data consumer needs to fully understand every individual scoring component, but they’ll understand that a dataset that scores poorly on Reliability and Usability struggles to land on time consistently and is difficult to use.</p><p id="672c" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">We could also weigh each dimension according to our perception of its importance in determining quality. We considered 1) how many scoring components belonged to each dimension, 2) the need to enable quick mental math, and 3) which elements our practitioners care about most, and allocated 100 total points across the dimensions accordingly:</p><figure class="pz qa qb qc qd ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb py"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*kYGdWRUZl7gBGoNN 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*kYGdWRUZl7gBGoNN 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*kYGdWRUZl7gBGoNN 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*kYGdWRUZl7gBGoNN 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*kYGdWRUZl7gBGoNN 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*kYGdWRUZl7gBGoNN 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*kYGdWRUZl7gBGoNN 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 
4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*kYGdWRUZl7gBGoNN 640w, https://miro.medium.com/v2/resize:fit:720/0*kYGdWRUZl7gBGoNN 720w, https://miro.medium.com/v2/resize:fit:750/0*kYGdWRUZl7gBGoNN 750w, https://miro.medium.com/v2/resize:fit:786/0*kYGdWRUZl7gBGoNN 786w, https://miro.medium.com/v2/resize:fit:828/0*kYGdWRUZl7gBGoNN 828w, https://miro.medium.com/v2/resize:fit:1100/0*kYGdWRUZl7gBGoNN 1100w, https://miro.medium.com/v2/resize:fit:1400/0*kYGdWRUZl7gBGoNN 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qe fc qf na nb qg qh be b bf z dt">The “Dimensions of Data Quality” and their weights</figcaption></figure><p id="597e" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Meanwhile, if desired, the dimensions could be unpacked to get to a more detailed view of data quality issues. 
For example, the Stewardship dimension scores an asset for quality indicators like whether it’s built on our paved path data engineering tools, its governance hygiene, and whether it meets valid data ownership standards.</p><figure class="pz qa qb qc qd ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qi"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*lHp1LrGFok_7_H0t 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*lHp1LrGFok_7_H0t 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*lHp1LrGFok_7_H0t 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*lHp1LrGFok_7_H0t 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*lHp1LrGFok_7_H0t 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*lHp1LrGFok_7_H0t 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*lHp1LrGFok_7_H0t 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*lHp1LrGFok_7_H0t 640w, https://miro.medium.com/v2/resize:fit:720/0*lHp1LrGFok_7_H0t 720w, https://miro.medium.com/v2/resize:fit:750/0*lHp1LrGFok_7_H0t 750w, https://miro.medium.com/v2/resize:fit:786/0*lHp1LrGFok_7_H0t 786w, https://miro.medium.com/v2/resize:fit:828/0*lHp1LrGFok_7_H0t 828w, https://miro.medium.com/v2/resize:fit:1100/0*lHp1LrGFok_7_H0t 1100w, https://miro.medium.com/v2/resize:fit:1400/0*lHp1LrGFok_7_H0t 1400w" sizes="(min-resolution: 4dppx) and 
(max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qe fc qf na nb qg qh be b bf z dt">Unpacking the Data Stewardship Dimension</figcaption></figure><h1 id="7dbb" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Presenting the Score to Practitioners</h1><p id="53e8" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">We knew surfacing the DQ Score in an explorable, actionable format was critical to its adoption and success. Furthermore, we had to surface data quality information directly in the venue where data users already discovered and explored data.</p><p id="072d" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Luckily, we had two existing tools that would make this much easier: Dataportal (Airbnb’s data catalog and exploration UI), and the <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19">Unified Metadata Service</a> (UMS). The score itself is computed in a daily offline data pipeline that collects and transforms various metadata elements from our data systems. The final task of the pipeline uploads the score for each data asset into UMS. By ingesting the DQ Score into UMS, we can surface the score and its components alongside every data asset in Dataportal, the starting point for all data discovery and exploration at Airbnb. 
All that remained was designing its presentation.</p><p id="b454" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">One of our goals was to surface the concept of quality to data practitioners with varying expertise and needs. Our user base had fully adopted the certified vs uncertified dynamic, but this was the first time we would be presenting the concept of a spectrum of quality, as well as the criteria used to define quality.</p><p id="52da" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">What would be the most interpretable version of a DQ Score? We needed to be able to present a single data quality score that held meaning at a quick glance, while also making it possible to explore the score in more detail.</p><p id="7c92" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Our final design presents data quality in three ways, each with a different use case in mind:</p><ol class=""><li id="71f0" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol qj pr ps bj"><strong class="nq gs">A single, high-level score from 0–100</strong>. We assigned categorical thresholds of “Poor”, “Okay”, “Good”, and “Great” based on a profiling analysis of our data warehouse that examined the existing distribution of our DQ score. Best for quick, high-level assessment of a dataset’s overall quality.</li><li id="c412" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol qj pr ps bj"><strong class="nq gs">Dimensional scores</strong>, where an asset can score perfectly on Accuracy but low on Reliability. 
Useful when a particular area of deficiency is not problematic (e.g., the consumer wants the data to be very accurate but is not worried about it landing quickly every day).</li><li id="772d" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol qj pr ps bj"><strong class="nq gs">Full score detail + Steps to improve, </strong>where data consumers can see exactly where an asset falls short and data producers can take action to improve an asset’s quality.</li></ol><p id="c755" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">All three of these presentations are shown in the screenshots below. The default presentation provides the dimensional scores “Scores per category”, the categorical descriptor of “Poor” along with the 40 points, and steps to improve.</p><figure class="pz qa qb qc qd ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb qk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*Yfqm6IxcFYyFPoqR 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*Yfqm6IxcFYyFPoqR 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*Yfqm6IxcFYyFPoqR 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*Yfqm6IxcFYyFPoqR 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*Yfqm6IxcFYyFPoqR 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*Yfqm6IxcFYyFPoqR 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*Yfqm6IxcFYyFPoqR 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*Yfqm6IxcFYyFPoqR 640w, https://miro.medium.com/v2/resize:fit:720/0*Yfqm6IxcFYyFPoqR 720w, https://miro.medium.com/v2/resize:fit:750/0*Yfqm6IxcFYyFPoqR 750w, https://miro.medium.com/v2/resize:fit:786/0*Yfqm6IxcFYyFPoqR 786w, https://miro.medium.com/v2/resize:fit:828/0*Yfqm6IxcFYyFPoqR 828w, https://miro.medium.com/v2/resize:fit:1100/0*Yfqm6IxcFYyFPoqR 1100w, https://miro.medium.com/v2/resize:fit:1400/0*Yfqm6IxcFYyFPoqR 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qe fc qf na nb qg qh be b bf z dt">Full data quality score page in Dataportal</figcaption></figure><p id="3dc5" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">If a user explores the full score details, they can examine the exact quality shortcomings and view informative tooltips providing more detail on the scoring component’s definition and merit.</p><figure class="pz qa qb qc qd ni na nb paragraph-image"><div role="button" tabindex="0" class="nj nk fg nl bg nm"><div class="na nb ql"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ZohXNFjX1rzrq5uO 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ZohXNFjX1rzrq5uO 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ZohXNFjX1rzrq5uO 750w, 
https://miro.medium.com/v2/resize:fit:786/format:webp/0*ZohXNFjX1rzrq5uO 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ZohXNFjX1rzrq5uO 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ZohXNFjX1rzrq5uO 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ZohXNFjX1rzrq5uO 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*ZohXNFjX1rzrq5uO 640w, https://miro.medium.com/v2/resize:fit:720/0*ZohXNFjX1rzrq5uO 720w, https://miro.medium.com/v2/resize:fit:750/0*ZohXNFjX1rzrq5uO 750w, https://miro.medium.com/v2/resize:fit:786/0*ZohXNFjX1rzrq5uO 786w, https://miro.medium.com/v2/resize:fit:828/0*ZohXNFjX1rzrq5uO 828w, https://miro.medium.com/v2/resize:fit:1100/0*ZohXNFjX1rzrq5uO 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ZohXNFjX1rzrq5uO 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="qe fc qf na nb qg qh be b bf z dt">Full score detail presentation</figcaption></figure><h1 id="1309" class="on oo gr be op oq 
or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">How the Score Is Being Used Today</h1><p id="0e28" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">For <strong class="nq gs">data producers</strong>, the score is providing</p><ul class=""><li id="4f01" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol pq pr ps bj">Clear, actionable steps to improve the DQ of their assets</li><li id="1b90" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Quantified DQ, measuring their work</li><li id="c264" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Clear expectations around DQ</li><li id="8083" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Targets for tech debt clean-up</li></ul><p id="d37d" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">For <strong class="nq gs">data consumers</strong>, the DQ Score</p><ul class=""><li id="18a5" class="no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol pq pr ps bj">Improves data discoverability</li><li id="8c2b" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Serves as a signal of trustworthiness for data (just like how the review system works for Airbnb Guests and Hosts)</li><li id="0138" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Informs consumers of the exact quality shortcomings so they can be comfortable with how they’re using the data</li><li id="6611" class="no np gr nq b nr pt nt nu nv pu nx ny nz pv ob oc od pw of og oh px oj ok ol pq pr ps bj">Enables consumers to seek out and demand data quality</li></ul><p id="3832" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz 
oa ob oc od oe of og oh oi oj ok ol gk bj">From a <strong class="nq gs">data strategy</strong> perspective, we are leveraging internal query data combined with the DQ Score to drive DQ efforts across our data warehouse. By considering both the volume and the type of consumption (e.g., whether a particular metric is surfaced in our Executive reporting), we are able to direct data teams to the most impactful data quality improvements. This visibility has been very enlightening for teams who were unaware of their long tail of low-quality assets, and has enabled us to double down on quality investments for heavy-lift data models that power a significant share of our data consumption.</p><p id="1430" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Finally, by developing the DQ Score, we were able to provide uniform guidance to our data producers on producing high-quality, albeit uncertified assets. The DQ Score has not replaced certification (e.g., only Midas-certified data can achieve a DQ Score &gt; 90). We continue to certify our most critical subset of data, and believe the use cases for these assets will always merit the manual validation, rigor, and upkeep of certification. But for everything else, the DQ Score reinforces and scales the principles of Midas across our warehouse.</p><h1 id="5887" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">What’s Next</h1><p id="2a97" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">We’re excited about now being able to measure and observe quantified improvements to our data quality, but we’re just getting started. We recently expanded on the original DQ Score to score our <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70">Minerva metrics and dimensions</a>. 
Similarly, we plan to bring the same concept of a DQ Score to other data assets like our event logs and ML features.</p><p id="3f15" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">As the requirements and demands against our data continue to evolve, so will our quality expectations. We’ll continue to evolve how we define and measure quality, and with rapid improvement in areas like <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19">metadata management and data classification</a>, we anticipate further efficiency and productivity gains for all data practitioners at Airbnb.</p><h1 id="fe39" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Appreciations</h1><p id="f571" class="pw-post-body-paragraph no np gr nq b nr pl nt nu nv pm nx ny nz pn ob oc od po of og oh pp oj ok ol gk bj">The DQ Score would not have been possible without several cross-functional and cross-org collaborators. 
They include, but are not limited to: <strong class="nq gs">Mark Steinbrick, Chitta Shirolkar, Jonathan Parks, Sylvia Tomiyama, Felix Ouk, Jason Flittner, Ying Pan, Logan George, Woody Zhou, Michelle Thomas, </strong>and<strong class="nq gs"> Erik Ritter.</strong></p><p id="a9cd" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj">Special thanks to the broader Airbnb data community members who provided input or aid to the implementation team throughout the design, development, and launch phases.</p><p id="e911" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><em class="qm">If this type of work interests you, check out some of our </em><a class="af om" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="qm">related positions.</em></a></p><p id="5abd" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><strong class="nq gs">****************</strong></p><p id="52a1" class="pw-post-body-paragraph no np gr nq b nr ns nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol gk bj"><em class="qm">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/data-quality-score-the-next-chapter-of-data-quality-at-airbnb-851dccda19c3</link>
      <guid>https://medium.com/airbnb-engineering/data-quality-score-the-next-chapter-of-data-quality-at-airbnb-851dccda19c3</guid>
      <pubDate>Tue, 28 Nov 2023 20:37:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Wisdom of Unstructured Data: Building Airbnb’s Listing Knowledge from Big Text Data]]></title>
      <description><![CDATA[<div><div class="hs ht hu hv hw"></div><p id="1736" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">How Airbnb leverages ML/NLP to extract useful information about listings from unstructured text data to power personalized experiences for guests.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz oa"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 786w, 
https://miro.medium.com/v2/resize:fit:828/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*lIyANhCNPb8mYJe7XM0Grw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div></figure><p id="9256" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj"><strong class="nc gs">By:</strong> <a class="af om" href="http://linkedin.com/in/hwlical" rel="noopener ugc nofollow" target="_blank">Hongwei Li</a> and <a class="af om" href="https://www.linkedin.com/in/peng-wang-13117371/" rel="noopener ugc nofollow" target="_blank">Peng Wang</a></p><h1 id="9c45" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Introduction</h1><p id="ac22" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">At Airbnb, it’s important for us to gather structured data about listings and better understand the data, so we can help Hosts provide great experiences for guests. For example, guests who work remotely need to know if a listing has a suitable workspace and reliable internet, while guests with children might need items like highchairs and cribs. 
However, not all listings clearly display these attributes, creating a mismatch between what Hosts’ listings offer and what guests are looking for.</p><p id="f05c" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">This is just one of many examples of how we can use the unstructured data generated on our platform, including anonymized text data from various text-based guest interactions with the platform, to extract useful structured data. Instead of relying on Hosts to manually input all the potential listing attributes, which would be tedious given the vast number of attributes guests care and inquire about, we developed a machine learning system called <strong class="nc gs">L</strong>isting <strong class="nc gs">A</strong>ttribute <strong class="nc gs">E</strong>xtraction <strong class="nc gs">P</strong>latform (LAEP) for extracting structured data at scale. Note that the project was originally named LATEX (<strong class="nc gs">L</strong>isting <strong class="nc gs">AT</strong>tribute <strong class="nc gs">EX</strong>traction) and is cited under that name in our <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/prioritizing-home-attributes-based-on-guest-interest-3c49b827e51a">previous tech blog</a>. We have since renamed the project to LAEP.</p><p id="f24e" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">LAEP automatically extracts structured information, such as listing attributes, directly from the unstructured text data we mentioned above. The attributes collected by LAEP are then integrated into various applications, building Airbnb’s Listing Knowledge. 
It powers downstream tools like the <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/prioritizing-home-attributes-based-on-guest-interest-3c49b827e51a">Attribute Prioritization System</a> (APS) and the listing attribute collection system (Eve).</p><p id="7107" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">LAEP doesn’t just extract listing attributes; it can also detect different types of entities, such as activities, hospitalities, and points of interest (POI) like famous landmarks. This opens up possibilities for supporting a wide range of product applications. For example, hospitality data can help guests get personalized services during their stay, while activity data can help identify and create new categories that guests love.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz pq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*x4j90Y2ExQv9Hoac 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*x4j90Y2ExQv9Hoac 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*x4j90Y2ExQv9Hoac 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*x4j90Y2ExQv9Hoac 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*x4j90Y2ExQv9Hoac 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*x4j90Y2ExQv9Hoac 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*x4j90Y2ExQv9Hoac 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, 
(-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*x4j90Y2ExQv9Hoac 640w, https://miro.medium.com/v2/resize:fit:720/0*x4j90Y2ExQv9Hoac 720w, https://miro.medium.com/v2/resize:fit:750/0*x4j90Y2ExQv9Hoac 750w, https://miro.medium.com/v2/resize:fit:786/0*x4j90Y2ExQv9Hoac 786w, https://miro.medium.com/v2/resize:fit:828/0*x4j90Y2ExQv9Hoac 828w, https://miro.medium.com/v2/resize:fit:1100/0*x4j90Y2ExQv9Hoac 1100w, https://miro.medium.com/v2/resize:fit:1400/0*x4j90Y2ExQv9Hoac 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 1. An illustration of the flow from LAEP to downstream applications such as the listing attribute collection system (Eve) and the Attribute Prioritization System (APS), which then feed into the Structure Data Catalog.</figcaption></figure><p id="a28b" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">Prior to LAEP, Airbnb had multiple ways to collect structured information for listings, including the Listing Editors page for Hosts, the Supplementary Review Flow (SRF) for guests, and partnerships with third-party vendors. However, these approaches faced several challenges and limitations. 
For instance, Airbnb minimized the impressions of SRF questions in the standard review flow to improve the guest review experience, which reduced data intake from the guest side. Consequently, there has been a growing need to extract listing information from unstructured text data, and LAEP was developed to address the aforementioned issues by automating this data collection process.</p><p id="af14" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The LAEP technology gathers and analyzes anonymized, unstructured text data, enabling many potential applications that can enhance the Airbnb experience for both Hosts and guests.</p><h1 id="d917" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">LAEP Implementation</h1><p id="2134" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">There are three main components in LAEP:</p><ol class=""><li id="1414" class="na nb gr nc b nd ne nf ng nh ni nj nk nl pv nn no np pw nr ns nt px nv nw nx py pz qa bj"><strong class="nc gs">Named Entity Recognition (NER)</strong>: This component identifies and classifies specific phrases or entities in free text into predefined categories like amenities, places of interest, and facilities. For example, the phrase “<em class="qb">swimming pool</em>”, found in text from various sources, would be detected as an entity with the type “<em class="qb">Amenity</em>”.</li><li id="9b8a" class="na nb gr nc b nd qc nf ng nh qd nj nk nl qe nn no np qf nr ns nt qg nv nw nx py pz qa bj"><strong class="nc gs">Entity Mapping (EM)</strong>: Once an entity is detected, EM maps it to standard listing attributes stored in Airbnb’s attribute database (Taxonomy). 
This allows LAEP to create a comprehensive catalog of Airbnb listings by associating detected entities with their corresponding attributes.</li><li id="6633" class="na nb gr nc b nd qc nf ng nh qd nj nk nl qe nn no np qf nr ns nt qg nv nw nx py pz qa bj"><strong class="nc gs">Entity Scoring (ES)</strong>: ES determines the presence of a detected phrase within a listing. It infers whether the attribute mentioned actually exists in the associated listing and provides a confidence level.</li></ol><p id="e978" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The components within LAEP are illustrated below:</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz qh"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*KQAyx7-8QevyP29Q1aaSoQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" 
/><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*KQAyx7-8QevyP29Q1aaSoQ.png 640w, https://miro.medium.com/v2/resize:fit:720/1*KQAyx7-8QevyP29Q1aaSoQ.png 720w, https://miro.medium.com/v2/resize:fit:750/1*KQAyx7-8QevyP29Q1aaSoQ.png 750w, https://miro.medium.com/v2/resize:fit:786/1*KQAyx7-8QevyP29Q1aaSoQ.png 786w, https://miro.medium.com/v2/resize:fit:828/1*KQAyx7-8QevyP29Q1aaSoQ.png 828w, https://miro.medium.com/v2/resize:fit:1100/1*KQAyx7-8QevyP29Q1aaSoQ.png 1100w, https://miro.medium.com/v2/resize:fit:1400/1*KQAyx7-8QevyP29Q1aaSoQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 2. The scope of LAEP includes three main components: Named Entity Recognition, Entity Mapping and Entity Scoring.</figcaption></figure><h1 id="bf63" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Named Entity Recognition</h1><p id="63f8" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">There are many off-the-shelf pretrained NER models that can extract general entity categories, but none of them fully supports Airbnb’s use cases. 
Therefore, we built our own NER models to detect and extract predefined entities important to Airbnb’s business from free text.</p><p id="61a5" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The NER model defines five types of entities (<strong class="nc gs">Amenity</strong>, <strong class="nc gs">Facility</strong>, <strong class="nc gs">Hospitality</strong>, <strong class="nc gs">Location features</strong>, and <strong class="nc gs">Structural details</strong>) that are important to Airbnb. We sampled and labeled 30K example texts from six channels, then trained the NER model. For current product use cases, we apply a language detection module and keep only English text. In the future, we may build a multilingual Transformer-based NER model to handle non-English content. The English text is then split into tokens. The NER model locates entity spans and classifies their labels using a convolutional neural network (CNN) framework. The output is a list of detected named entities, in the format of tuples <em class="qb">&lt;entity label, start index, end index&gt;</em>. 
Combining all components together, the NER pipeline is shown in figure 3.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz pq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ap6ezN-TVx2yXr-H 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ap6ezN-TVx2yXr-H 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ap6ezN-TVx2yXr-H 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*ap6ezN-TVx2yXr-H 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ap6ezN-TVx2yXr-H 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ap6ezN-TVx2yXr-H 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ap6ezN-TVx2yXr-H 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*ap6ezN-TVx2yXr-H 640w, https://miro.medium.com/v2/resize:fit:720/0*ap6ezN-TVx2yXr-H 720w, https://miro.medium.com/v2/resize:fit:750/0*ap6ezN-TVx2yXr-H 750w, https://miro.medium.com/v2/resize:fit:786/0*ap6ezN-TVx2yXr-H 786w, https://miro.medium.com/v2/resize:fit:828/0*ap6ezN-TVx2yXr-H 828w, https://miro.medium.com/v2/resize:fit:1100/0*ap6ezN-TVx2yXr-H 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ap6ezN-TVx2yXr-H 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 3. The overview of NER pipeline and the functional components</figcaption></figure><p id="30ab" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">Figure 4 shows an example output from the pipeline with detected entities highlighted.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz qi"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*iauFv6-5wdocvx4m 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*iauFv6-5wdocvx4m 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*iauFv6-5wdocvx4m 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*iauFv6-5wdocvx4m 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*iauFv6-5wdocvx4m 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*iauFv6-5wdocvx4m 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*iauFv6-5wdocvx4m 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" 
srcset="https://miro.medium.com/v2/resize:fit:640/0*iauFv6-5wdocvx4m 640w, https://miro.medium.com/v2/resize:fit:720/0*iauFv6-5wdocvx4m 720w, https://miro.medium.com/v2/resize:fit:750/0*iauFv6-5wdocvx4m 750w, https://miro.medium.com/v2/resize:fit:786/0*iauFv6-5wdocvx4m 786w, https://miro.medium.com/v2/resize:fit:828/0*iauFv6-5wdocvx4m 828w, https://miro.medium.com/v2/resize:fit:1100/0*iauFv6-5wdocvx4m 1100w, https://miro.medium.com/v2/resize:fit:1400/0*iauFv6-5wdocvx4m 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 4. Example output from the NER pipeline. The detected entities are highlighted and each entity category is marked with a different color.</figcaption></figure><p id="8ca5" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The labeled dataset was randomly split into training and testing datasets with a 9:1 ratio. After training completes, we evaluate the model performance on the testing dataset across all text channels. The evaluation criterion is Strict Match, which requires correctly identifying both the boundary and the category of an entity. 
The model’s overall performance and its performance for each category are shown in Figure 5.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz qj"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*hSaL7fXAgmvy2S0B 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*hSaL7fXAgmvy2S0B 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*hSaL7fXAgmvy2S0B 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*hSaL7fXAgmvy2S0B 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*hSaL7fXAgmvy2S0B 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*hSaL7fXAgmvy2S0B 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*hSaL7fXAgmvy2S0B 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*hSaL7fXAgmvy2S0B 640w, https://miro.medium.com/v2/resize:fit:720/0*hSaL7fXAgmvy2S0B 720w, https://miro.medium.com/v2/resize:fit:750/0*hSaL7fXAgmvy2S0B 750w, https://miro.medium.com/v2/resize:fit:786/0*hSaL7fXAgmvy2S0B 786w, https://miro.medium.com/v2/resize:fit:828/0*hSaL7fXAgmvy2S0B 828w, https://miro.medium.com/v2/resize:fit:1100/0*hSaL7fXAgmvy2S0B 1100w, https://miro.medium.com/v2/resize:fit:1400/0*hSaL7fXAgmvy2S0B 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, 
(-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 5. Example performance metrics (Precision, Recall, F1 scores) for NER model</figcaption></figure><h1 id="f73e" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Entity Mapping</h1><p id="5697" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">There are many different ways for people to talk about the same thing. For instance, we found over twelve variations for the attribute “lockbox,” such as <em class="qb">lock box, lock-box, box for the key, </em>and<em class="qb"> keybox</em>. Typos like “<em class="qb">ket box</em>” are also common due to input from error-prone mobile devices. Therefore, we need to map different variations of named-entities to the standard entity name as defined by the standard taxonomy for downstream applications.</p><p id="d1d6" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">With hundreds of listing attributes but millions of detected phrases in a year, many phrases map to the same attribute (like “lockbox”) while others have no mapping. To address this, we introduce confidence levels for mappings, allowing us to establish rules for cases where mapping cannot be done. 
A confidence value between 0 and 1 is assigned to each mapping, and if no mapping exceeds the confidence threshold, the phrase is marked as “No Mapping.”</p><p id="3e69" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">Labeling these mappings becomes challenging when dealing with numerous unique phrases and potential attributes. Typically, labeling involves comparing the semantic similarity between the phrase and each of the 800+ attribute names. To avoid this cost, we started with unsupervised learning methods rather than supervised ones, which saves significant labeling effort.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz qk"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*eboslixlFXbBZrJ_ 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*eboslixlFXbBZrJ_ 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*eboslixlFXbBZrJ_ 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*eboslixlFXbBZrJ_ 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*eboslixlFXbBZrJ_ 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*eboslixlFXbBZrJ_ 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*eboslixlFXbBZrJ_ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*eboslixlFXbBZrJ_ 640w, 
https://miro.medium.com/v2/resize:fit:720/0*eboslixlFXbBZrJ_ 720w, https://miro.medium.com/v2/resize:fit:750/0*eboslixlFXbBZrJ_ 750w, https://miro.medium.com/v2/resize:fit:786/0*eboslixlFXbBZrJ_ 786w, https://miro.medium.com/v2/resize:fit:828/0*eboslixlFXbBZrJ_ 828w, https://miro.medium.com/v2/resize:fit:1100/0*eboslixlFXbBZrJ_ 1100w, https://miro.medium.com/v2/resize:fit:1400/0*eboslixlFXbBZrJ_ 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 6. Entity Mapping: map detected NER phrases from free text to predefined listing attributes.</figcaption></figure><p id="944e" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">In LAEP, the entity mapping approach involves the following steps:</p><ol class=""><li id="8be9" class="na nb gr nc b nd ne nf ng nh ni nj nk nl pv nn no np pw nr ns nt px nv nw nx py pz qa bj"><strong class="nc gs">Preprocessing</strong>: Both the listing attributes and detected phrases undergo preprocessing techniques such as lowercasing and lemmatization to eliminate unnecessary word variations.</li><li id="d915" class="na nb gr nc b nd qc nf ng nh qd nj nk nl qe nn no np qf nr ns nt qg nv nw nx py pz qa bj"><strong class="nc gs">Mapping to Word Embeddings</strong>: All standard listing attributes are mapped to the word-embedding space using a <a class="af om" href="https://arxiv.org/abs/1301.3781" rel="noopener ugc nofollow" target="_blank">word2vec</a> model 
fine-tuned with Airbnb’s text data.</li><li id="1c32" class="na nb gr nc b nd qc nf ng nh qd nj nk nl qe nn no np qf nr ns nt qg nv nw nx py pz qa bj"><strong class="nc gs">Finding Closest Attribute</strong>: For a preprocessed detected phrase, the closest listing attribute is determined based on cosine similarity in the word-embedding space. The similarity score serves as the confidence score for the mapping.</li></ol><p id="c193" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">As in the example in Figure 6, the phrase “Lock-box” is mapped into the embedding space of listing attributes and compared with each attribute. The closest match is the attribute “lockbox,” which is identified as the top mapping.</p><h1 id="38ab" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Entity Scoring</h1><p id="47bc" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">After mapping a detected phrase to a standard listing attribute, it’s important to infer metadata about the attribute, such as its existence, usability, and local sentiment. Among these, attribute presence is crucial for the guest experience, especially for amenities like “crib” or “highchair” that matter to guests with infants.</p><p id="9e4c" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The presence model in LAEP determines if the mapped attribute exists in the listing by performing local text classification. 
It provides a discrete output (YES, Unknown, NO) indicating attribute presence, accompanied by a confidence score for the inference.</p><p id="4d4b" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The label classes are {YES, Unknown, NO}, where YES means the attribute is present, NO means it’s not present, and Unknown accounts for cases where presence is hard to determine from the text alone (e.g., amenity not present).</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz pq"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*ReStnDPKFX8uofqR 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*ReStnDPKFX8uofqR 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*ReStnDPKFX8uofqR 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*ReStnDPKFX8uofqR 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*ReStnDPKFX8uofqR 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*ReStnDPKFX8uofqR 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*ReStnDPKFX8uofqR 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*ReStnDPKFX8uofqR 640w, https://miro.medium.com/v2/resize:fit:720/0*ReStnDPKFX8uofqR 720w, https://miro.medium.com/v2/resize:fit:750/0*ReStnDPKFX8uofqR 750w, 
https://miro.medium.com/v2/resize:fit:786/0*ReStnDPKFX8uofqR 786w, https://miro.medium.com/v2/resize:fit:828/0*ReStnDPKFX8uofqR 828w, https://miro.medium.com/v2/resize:fit:1100/0*ReStnDPKFX8uofqR 1100w, https://miro.medium.com/v2/resize:fit:1400/0*ReStnDPKFX8uofqR 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 7. Illustration of entity scoring for the meta info about certain entities of interest.</figcaption></figure><p id="5498" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">To build this text classification model, the ES component employs a fine-tuned <a class="af om" href="https://arxiv.org/abs/1810.04805" rel="noopener ugc nofollow" target="_blank">BERT</a> model. It analyzes source data, including detected phrases and their local context, to infer attribute existence. 
The output can then be used in the APS and Eve systems to provide recommendations to Hosts, merchandize existing home attributes, or clarify popular listing facilities.</p><figure class="ob oc od oe of og ny nz paragraph-image"><div role="button" tabindex="0" class="oh oi fg oj bg ok"><div class="ny nz ql"><picture><source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/0*2LTvSo5iksvBDf8R 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/0*2LTvSo5iksvBDf8R 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/0*2LTvSo5iksvBDf8R 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/0*2LTvSo5iksvBDf8R 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/0*2LTvSo5iksvBDf8R 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/0*2LTvSo5iksvBDf8R 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/0*2LTvSo5iksvBDf8R 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp" /><source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/0*2LTvSo5iksvBDf8R 640w, https://miro.medium.com/v2/resize:fit:720/0*2LTvSo5iksvBDf8R 720w, https://miro.medium.com/v2/resize:fit:750/0*2LTvSo5iksvBDf8R 750w, https://miro.medium.com/v2/resize:fit:786/0*2LTvSo5iksvBDf8R 786w, https://miro.medium.com/v2/resize:fit:828/0*2LTvSo5iksvBDf8R 828w, https://miro.medium.com/v2/resize:fit:1100/0*2LTvSo5iksvBDf8R 1100w, https://miro.medium.com/v2/resize:fit:1400/0*2LTvSo5iksvBDf8R 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" /></picture></div></div><figcaption class="pr fc ps ny nz pt pu be b bf z dt">Figure 8. Architecture of Presence Score Model. (Adapted courtesy of <a class="af om" href="https://www.researchgate.net/publication/344901824_ProBERT_Product_Data_Classification_with_Fine-tuning_BERT_Model" rel="noopener ugc nofollow" target="_blank">Zahera and Sherif et al.</a>)</figcaption></figure><p id="7a2c" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">The model architecture (Figure 8) utilizes a pre-trained BERT model with text data from six different sources. The input text is truncated to a maximum length of 512 tokens. Empirical studies suggest that using 65 words around the detected phrase (32 before and 32 after) achieves the best result. 
The embeddings from the [CLS] token are passed through a fully connected layer, dropout layer, and ReLU linear projection layer to generate a probability vector over the label classes.</p><h1 id="f8c4" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Conclusion</h1><p id="df98" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">In this post, we introduced LAEP, an end-to-end structured information extraction system within Airbnb. It detects phrases of interest from various text data sources, maps them into standard listing attribute taxonomies, and then infers the meta information of the attributes from the contextual information in the texts, with privacy-by-design controls whose objective is to not process personal information. LAEP is applied in downstream applications like <a class="af om" rel="noopener" href="https://medium.com/airbnb-engineering/prioritizing-home-attributes-based-on-guest-interest-3c49b827e51a">APS</a>, and can be leveraged to help our teams find new categories of listings and discover new listing attributes that matter to guests. 
It helps us understand Airbnb’s listings better at scale and can power future applications that continue to improve the experience of both our Hosts and guests.</p><p id="e3cb" class="pw-post-body-paragraph na nb gr nc b nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv nw nx gk bj">If this type of work interests you, check out some of our related positions at <a class="af om" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">Careers at Airbnb</a>!</p><h1 id="2e34" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">Acknowledgments</h1><p id="8d3d" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj">We would like to thank all the people who supported this project: <strong class="nc gs">Qianru Ma, Joy Jing, Xiao Li, Brennan Polley,</strong> <strong class="nc gs">Paolo Massimi,</strong> <strong class="nc gs">Dean Chen, Guillaume Guy, Lianghao Li, Mia Zhao, Joy Zhang, Usman Abbasi, Pavan Tapadia,</strong> <strong class="nc gs">Jing Xia</strong>, <strong class="nc gs">Maggie Jarley </strong>and more. Special thanks to<strong class="nc gs"> Ben Mendeler, Shaowei Su, Alfredo Luque and Tianxiang Chen</strong> from the ML-infra team for their generous support and help.</p><h1 id="4a70" class="on oo gr be op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph pi pj pk bj">****************</h1><p id="2aae" class="pw-post-body-paragraph na nb gr nc b nd pl nf ng nh pm nj nk nl pn nn no np po nr ns nt pp nv nw nx gk bj"><em class="qb">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/wisdom-of-unstructured-data-building-airbnbs-listing-knowledge-from-big-text-data-7c533466a63c</link>
      <guid>https://medium.com/airbnb-engineering/wisdom-of-unstructured-data-building-airbnbs-listing-knowledge-from-big-text-data-7c533466a63c</guid>
      <pubDate>Wed, 15 Nov 2023 18:02:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Helena Zarazua]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gk gl gm gn go"><div class="ab ca"><div class="ch bg fw fx fy fz"><div><div class="hs ht hu hv hw"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hx hy hz ia ib ab"><div><div class="ab ic"><a rel="noopener follow" href="https://medium.com/@lauren.mackevich"><div><div class="bl" aria-hidden="false"><div class="l id ie bx if ig"><div class="l fg"><img alt="Lauren Mackevich" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/0*-imhApAGWwgM89i1.jpg" width="44" height="44" data-testid="authorPhoto" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering" rel="noopener follow"><div class="ij ab fg"><div><div class="bl" aria-hidden="false"><div class="l ik il bx if im"><div class="l fg"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" data-testid="publicationPhoto" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" data-testid="authorName" rel="noopener follow" href="https://medium.com/@lauren.mackevich">Lauren Mackevich</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2Fae9de0d76057&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmy-journey-to-airbnb-helena-zarazua-3b0aad94a04a&amp;user=Lauren+Mackevich&amp;userId=ae9de0d76057">Follow</a></p></div></div></div></div><div class="l iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab 
jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" data-testid="publicationName" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">6 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fd fe q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon fg jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" data-testid="headerClapButton" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F3b0aad94a04a&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmy-journey-to-airbnb-helena-zarazua-3b0aad94a04a&amp;user=Lauren+Mackevich&amp;userId=ae9de0d76057"><div><div class="bl" aria-hidden="false"><div class="lb ao lc ld le lf am lg lh li la"></div></div></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa sb cm"><div class="l ae"><div class="ab ca"><div class="sc sd se sf sg oe ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><p id="a44d" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np 
nq gk bj">Growing from engineering apprentice to seasoned iOS developer</p><figure class="nu nv nw nx ny nz nr ns paragraph-image"><div role="button" tabindex="0" class="oa ob fg oc bg od"><div class="nr ns nt"><picture></picture></div></div></figure><p id="c851" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj"><em class="og">Languages have always come naturally to </em><a class="af oh" href="https://www.linkedin.com/in/helena-zarazua-50b7a3127/" rel="noopener ugc nofollow" target="_blank"><em class="og">Helena Zarazua</em></a><em class="og">, who has used this skill to bring people together, whether by teaching English to Chinese businesspeople or by immersing American preschoolers in Spanish. Since then, Helena has joined Airbnb through the Connect engineering apprenticeship program and has stayed on as a full-time engineer. She’s picked up new (programming) languages like Swift to specialize in iOS development, and works on features to create a world where anyone can belong anywhere.</em></p><p id="1881" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj"><em class="og">Read on to hear Helena’s story, from none other than Helena herself.</em></p><h1 id="9d6a" class="oi oj gr be ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bj">And… action! Becoming my own main character</h1><p id="887a" class="pw-post-body-paragraph mt mu gr mv b mw pg my mz na ph nc nd ne pi ng nh ni pj nk nl nm pk no np nq gk bj">I’ve always dreamt big, but not once did my dreams include software engineering. Despite having had an aptitude for mathematics from a young age, my aspirations emphasized my artistic side: I wanted to be a singer, actress, or anyone involved in film production. 
Even though (or maybe precisely because) my father was a computer engineer, I had a preconceived notion that I just wasn’t made for the technology industry.</p><p id="8aa9" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">That turned out to be completely false, but it’s certainly been a journey to reach where I am today.</p><p id="f67b" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">Growing up in Mexico City, I was firmly on the path to pursuing a career in the arts. I attended an arts high school where I focused on stage acting, producing, and directing, all experiences for which I’m very grateful. I studied other subjects, too, and I had one very inspirational math teacher who helped me see the philosophical side of math, but my plan after high school was to go even deeper into film.</p><p id="c2ae" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">Although I received admission to film school, ultimately, it didn’t make financial sense to attend that particular program in the US. Instead, I used this as an opportunity to take a sabbatical and explore my other talents and passions in life. In addition to my knack for math, I realized that languages are another strength of mine — I reached conversational fluency in German just because I once dated somebody from Germany!</p><p id="ca03" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">When I found a remote job opportunity to teach business English with a Chinese company, I used it as a chance to travel the world over a three-year period. 
It’s quite a nice coincidence that now at Airbnb, I get to help millions of people do the same.</p><p id="9df4" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">My travels brought me to San Francisco, where I decided to settle down. I changed jobs a couple times to try other roles, including one as a teaching assistant at a Spanish immersion daycare. The culture of technology runs high in the Bay Area, though, and some encouragement from my partner at the time made me consider something that was never before on my radar — coding bootcamp.</p><h1 id="2ae9" class="oi oj gr be ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bj">From bootcamp to Airbnb</h1><p id="bfc3" class="pw-post-body-paragraph mt mu gr mv b mw pg my mz na ph nc nd ne pi ng nh ni pj nk nl nm pk no np nq gk bj">I used to think I wasn’t technical enough to be an engineer, even though I was always good at math. I felt like there was a certain type of person who went into software and that just wasn’t me. But I also felt like I had nothing to lose, so I applied for a software engineering bootcamp in San Francisco. Perhaps a sign that I could succeed in engineering after all, I aced the interview and officially enrolled in the program.</p><p id="f3f4" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">A big part of my journey that I’ve yet to talk about is mental health. Throughout my life, I’ve struggled with ADHD and OCD, in addition to clinical depression and anxiety. The intersection of these has been especially challenging, and I’m always cognizant of how to balance my mental health with my career. 
Bootcamp, as the name suggests, was an extremely intense 16 weeks, not to mention the added stress of the pandemic starting at that time.</p><p id="3fe9" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">Given my ADHD, I prefer to work and learn in person, so the shift to a remote bootcamp wasn’t easy. That said, the support network at my bootcamp was really strong, particularly the program’s mentorship and coaching. I put a lot of time and effort into my job search, which was made easier thanks to the help of my advisors.</p><p id="cd2f" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">My first contact with Airbnb was through a panel discussion. Hearing from people at Airbnb with non-traditional backgrounds like myself was really reassuring. They emphasized just how accessible and friendly the culture at Airbnb is. It sounded like a great environment that truly values diversity.</p><p id="e8c0" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">Much like my foray into the bootcamp, I thought at the very least I’d try applying to Airbnb. The interview process itself reinforced my perception of the company’s warmth. I genuinely enjoyed my interpersonal interview where I got the sense that the interviewer was making an effort to get to know me as a human being. 
Personally, I felt that Airbnb’s technical interviews weren’t as intimidating as I expected, and the vibe was always constructive.</p><h1 id="e9cb" class="oi oj gr be ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bj">A paved road to full-time: The Airbnb Connect apprenticeship</h1><p id="cffc" class="pw-post-body-paragraph mt mu gr mv b mw pg my mz na ph nc nd ne pi ng nh ni pj nk nl nm pk no np nq gk bj">Needless to say, the interviews went well and I joined the second cohort of <a class="af oh" rel="noopener" href="https://medium.com/airbnb-engineering/inside-connect-airbnbs-engineering-apprenticeship-program-c26d6eb2768c">Airbnb’s Connect program</a>! Connect is a six-month engineering apprenticeship designed to attract non-traditional candidates who would be successful at Airbnb. Structured in two parts, Connect starts with training on Airbnb’s “paved road,” the technical stack engineers need to know to work with a large production codebase. In the program’s second half, apprentices join engineering teams throughout the company and work on real projects with a dedicated team buddy.</p><p id="b606" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">Airbnb is heavily invested in the success of Connect, with the ultimate goal of participants staying with the company as full-time engineers. The program consistently had a positive atmosphere where I felt set up for success. Notably, the cohort was intentionally small (just ten people) to make sure that every apprentice had enough support and could transition to full-time as long as they put in the work. At no point did I ever feel like I was in competition with my fellow apprentices.</p><p id="c617" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">The apprentice community was very tight knit and served as a strong support system. 
When everybody was working remotely, we still found opportunities to gather safely and work together, and now that the office has fully reopened, there’s a group of us who continue to regularly get together even after we’ve graduated.</p><p id="1093" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">One of the qualities I respect most about Connect is the program’s commitment to feedback and development. The coordinators make an active effort to improve the apprentice experience based on our input, and they care deeply about creating an inclusive environment for people of all backgrounds.</p><h1 id="3229" class="oi oj gr be ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf bj">What I’ve learned, the hard and soft skills</h1><p id="e1da" class="pw-post-body-paragraph mt mu gr mv b mw pg my mz na ph nc nd ne pi ng nh ni pj nk nl nm pk no np nq gk bj">I’ve learned so much during my time at Airbnb, both as an apprentice and as a full-time engineer. From a purely technical point of view, I’m more confident in my engineering skills and capable of working independently on bigger projects.</p><p id="9c05" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">Originally, I expected to be a full-stack engineer since that was my bootcamp’s focus, but when organizational needs changed and my manager asked if I wanted to try mobile development, I approached that challenge head on. I’m proud to now have some larger product launches under my belt, having built iOS screens from scratch and end-to-end. 
That hands-on experience has given me a strong understanding of how everything fits together in a way that used to feel much more abstract.</p><p id="3b21" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">It was definitely a steep learning curve, but that process taught me valuable lessons beyond the mechanics of iOS development. As paradoxical as it sounds, I’ve found the key is to get comfortable being uncomfortable. If you’re challenging yourself enough, you’ll constantly be learning and working with unfamiliar technologies. Being accepting of that and approaching problems one step at a time is the most productive way forward.</p><p id="84c9" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">And then there’s communication. I can’t stress enough how important it is to be your biggest advocate (all the more so when working remotely!). Be clear about what you need with your manager, teammates, and collaborators. Especially in a work culture like Airbnb’s, people genuinely want you to succeed. The way to facilitate that is to openly share with your team. In my case, I’ve communicated my mental health needs with my team and that’s been so helpful in shaping my experience for the better.</p><p id="5e03" class="pw-post-body-paragraph mt mu gr mv b mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq gk bj">The learnings I’ve made here at Airbnb are a large part of why I’m interested in exploring mentorship and management in the future. The community from both the Connect program and Airbnb at large has taught me so much. I’d love to pay that forward and help future Airbnb employees grow as much as I have. 
We’re hiring for many roles, so take a look at <a class="af oh" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">our career page</a> and who knows, maybe we’ll be working together soon enough.</p></div></div></div></div></div></div></div></div></section></div></div></article>
<article class="dv"><div class="dv qc l"><div class="bg dv"><div class="dv l"><div class="dv tz ua ub uc ud ue uf ug uh ui uj uk ul"><div class="um"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="How VPNs really work" rel="noopener follow" href="https://medium.com/@hnasr/how-vpns-really-work-a5da843d0eb3"><div class="uo up uq ur us"><img alt="How VPNs really work" class="bg ut uu uv uw bw" 
src="https://miro.medium.com/v2/resize:fit:1358/1*Z11OnmcvEME2wzzLuw8N1Q.png" /></div></a></div><div class="un ab ca cn"><div class="ux uy uz va vb ab"><div class="po l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@hnasr"><div class="l fg"><img alt="Hussein Nasser" class="l fa bx vc vd cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*j-h09TiaKTgYsIvVAHPa4Q@2x.jpeg" width="20" height="20" /></div></a></div></div></div><div class="ve l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@hnasr"><p class="be b du z jh ji jj jk jl jm jn jo bj">Hussein Nasser</p></a></div></div></div></div><div class="vf vg vh vi vj vk vl vm vn vo l gk"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@hnasr/how-vpns-really-work-a5da843d0eb3"><div title=""><h2 class="be gs ol on vp vq oo op or vr vs os ne vt vu vv vw ni vx vy vz wa nm wb wc wd we jh jj jk jm jo bj">How VPNs really work</h2></div><div class="wf l"><h3 class="be b iq z jh wg jj jk wh jm jo dt">Under the hood</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@hnasr/how-vpns-really-work-a5da843d0eb3"><div class="ab q">3 min read·Oct 2</div></a><div class="wi wj wk wl wm l"><div class="ab co"><div class="am wn wo wp wq wr ws wt wu wv ww ab q"><div class="ab q kx"><div class="pw-multi-vote-icon fg jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2Fa5da843d0eb3&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2F%40hnasr%2Fhow-vpns-really-work-a5da843d0eb3&amp;user=Hussein+Nasser&amp;userId=e4cbe924ccb"><div><div class="bl" aria-hidden="false"><div class="lb ao lc ld le 
lf am lg lh li la"></div></div></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="wx l"><div><div class="bl" aria-hidden="false"><a class="af fh ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/@hnasr/how-vpns-really-work-a5da843d0eb3?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">9</p></a></div></div></div></div><div class="ab q wz xa"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-helena-zarazua-3b0aad94a04a</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-helena-zarazua-3b0aad94a04a</guid>
      <pubDate>Wed, 18 Oct 2023 17:40:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Unlocking SwiftUI at Airbnb]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gk gl gm gn go"><div class="ab ca"><div class="ch bg fw fx fy fz"><div><h2 id="85c5" class="pw-subtitle-paragraph ho gq gr be b hp hq hr hs ht hu hv hw hx hy hz ia ib ic id cp dt">How Airbnb adopted SwiftUI in our iOS app</h2><div class="ie if ig ih ii"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="ij ik il im in ab"><div><div class="ab io"><a rel="noopener follow" href="https://medium.com/@bryn.bodayle"><div><div class="bl" aria-hidden="false"><div class="l ip iq bx ir is"><div class="l fg"><img alt="Bryn Bodayle" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/1*Mvz40LLgt-EWM70IcGeGLg.jpeg" width="44" height="44" data-testid="authorPhoto" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering" rel="noopener follow"><div class="iw ab fg"><div><div class="bl" aria-hidden="false"><div class="l ix iy bx ir iz"><div class="l fg"><img alt="The Airbnb Tech Blog" class="l fa bx bq ja cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" data-testid="publicationPhoto" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="jb ab q"><div class="ab q jc"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b jd je bj"><a class="af ag ah ai aj ak al am an ao ap aq ar jf" data-testid="authorName" rel="noopener follow" href="https://medium.com/@bryn.bodayle">Bryn Bodayle</a></p></div></div></div>·<p class="be b jd je dt"><a class="ji jj ah ai aj ak al am an ao ap aq ar eu jk jl" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2Fd28080b8ee0&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Funlocking-swiftui-at-airbnb-ea58f50cde49&amp;user=Bryn+Bodayle&amp;userId=d28080b8ee0">Follow</a></p></div></div></div></div><div class="l jm"><div class="ab cm jn jo jp"><div class="jq jr ab"><div class="be b bf z dt ab js">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jf ab q" data-testid="publicationName" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b bf z ju jv jw jx jy jz ka kb bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">10 min read<div class="kc kd l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt"><div class="h k w fd fe q"><div class="lj l"><div class="ab q lk"><div class="pw-multi-vote-icon fg jt ll lm ln"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" data-testid="headerClapButton" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fea58f50cde49&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Funlocking-swiftui-at-airbnb-ea58f50cde49&amp;user=Bryn+Bodayle&amp;userId=d28080b8ee0"><div class="lo ao lp lq lr ls am lt lu lv ln"></div></a></div><div class="pw-multi-vote-count l lw lx ly lz ma mb mc"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q ku kv kw kx ky kz la lb lc ld le lf lg lh li"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa uh cm"><div class="l ae"><div class="ab ca"><div class="ui uj uk ul um oq ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" 
aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><p id="83b2" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj"><a class="af oc" href="https://www.linkedin.com/in/brynbodayle" rel="noopener ugc nofollow" target="_blank">Bryn Bodayle</a></p><figure class="og oh oi oj ok ol od oe paragraph-image"><div role="button" tabindex="0" class="om on fg oo bg op"><div class="od oe of"><picture></picture></div></div></figure><p id="5391" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">When constructing an app’s user interface (UI), the choice of framework is incredibly important. The right UI framework can make an app feel smooth, responsive, even delightful, while a UI framework that doesn’t match an app’s needs can make it feel sluggish and broken. This principle extends to developer experience as well; a UI framework with well-designed APIs can enable engineers to express themselves fluently, efficiently, and correctly, while one with the wrong abstractions or inconsistent APIs can make engineers’ jobs more difficult by slowing them down with unnecessary complexity.</p><p id="8abb" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">At Airbnb, we want our mobile apps to provide a world-class user experience <em class="os">and</em> a world-class developer experience. This desire led us to build our own UI framework named <a class="af oc" rel="noopener" href="https://medium.com/airbnb-engineering/introducing-epoxy-for-ios-6bf062be1670">Epoxy</a> in 2016. 
Epoxy is a declarative UI framework, which means that engineers describe <em class="os">what</em> their UI should be structured like for a given screen state and the framework then figures out <em class="os">how</em> to make updates to the view hierarchy to render the screen contents. Epoxy uses UIKit under the hood to render views.</p><p id="2b1f" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">The iOS UI framework landscape shifted in 2019 with the introduction of <a class="af oc" href="https://developer.apple.com/tutorials/swiftui" rel="noopener ugc nofollow" target="_blank">SwiftUI</a>, a first-party declarative UI framework that accomplishes many of the same goals as Epoxy. Although SwiftUI was <a class="af oc" href="https://github.com/airbnb/epoxy-ios/wiki/FAQ#why-would-i-use-epoxy-and-uikit-instead-of-swiftui" rel="noopener ugc nofollow" target="_blank">not a good fit for our needs</a> during its first three years, by 2022 it offered increased stability and API availability. It was around this time that we started to consider adopting SwiftUI at Airbnb.</p><p id="23f6" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">In this post, we share why and how we ultimately replaced Epoxy and UIKit with SwiftUI at Airbnb. We’ll detail how we integrated SwiftUI into Airbnb’s design system, explain the results of this effort, and enumerate a few challenges we’re still working through. 
After reading this post you’ll understand why SwiftUI has met our high bar for both user and developer experience.</p><h1 id="9526" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Evaluating and Planning for SwiftUI</h1><p id="6ab9" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">Switching to a new UI framework is not a task that should be undertaken lightly. After much investigation, we posited that SwiftUI would not regress the user experience and would improve developer experience because of the following hypotheses:</p><ul class=""><li id="f1db" class="ng nh gr ni b hp nj nk nl hs nm nn no np pu nr ns nt pv nv nw nx pw nz oa ob px py pz bj"><strong class="ni gs">Flexible and composable: </strong>SwiftUI would offer more powerful and flexible patterns to manage view variants and styling along with <a class="af oc" href="https://www.swiftbysundell.com/tips/swiftui-extensions-using-generics/" rel="noopener ugc nofollow" target="_blank">generic views</a> and <a class="af oc" href="https://developer.apple.com/documentation/swiftui/viewmodifier" rel="noopener ugc nofollow" target="_blank">view modifiers</a>. This should substantially reduce the number of views required to build the app, since it would be both easier to customize existing views and to compose new behavior inline at the callsite.</li><li id="c3ed" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj"><strong class="ni gs">Fully declarative:</strong> SwiftUI code would be simpler to reason about and change over time. 
There should typically be no context switching between imperative and declarative coding paradigms like we had in Epoxy, for which engineers frequently needed to “drop down” into UIKit code.</li><li id="0024" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj"><strong class="ni gs">Less code:</strong> As a result of SwiftUI being fully declarative, we believed it would take dramatically less code to build a SwiftUI view component. Generally, bug count correlates with lines of code.</li><li id="b7e7" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj"><strong class="ni gs">Faster iteration:</strong> <a class="af oc" href="https://developer.apple.com/documentation/swiftui/previews-in-xcode" rel="noopener ugc nofollow" target="_blank">Xcode previews</a> would enable near-instant iteration cycles on SwiftUI view components and screens, as compared to 30 second or more build and run iteration cycles with UIKit.</li><li id="393c" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj"><strong class="ni gs">Idiomatic:</strong> SwiftUI would lower cognitive overhead when building UI, due to fewer custom paradigms and patterns. 
This would make it easier to onboard new engineers.</li></ul><p id="e77b" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">With these hypotheses in mind, we hatched a plan to evaluate and to adopt SwiftUI in three phases:</p><ul class=""><li id="9f6c" class="ng nh gr ni b hp nj nk nl hs nm nn no np pu nr ns nt pv nv nw nx pw nz oa ob px py pz bj"><strong class="ni gs">Phase 1:</strong> Build leaf views, such as reusable view components, from our design system</li><li id="f353" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj"><strong class="ni gs">Phase 2:</strong> Build entire screens such as the reservation details page or the user profile page</li><li id="1cf7" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj"><strong class="ni gs">Phase 3:</strong> Build complete features composed from multiple screens</li></ul><p id="0f09" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">As of the writing of this post, we have successfully completed the first two phases of SwiftUI adoption and for Phase Three await flexible navigation APIs to be added to SwiftUI. For the component (Phase One) and screen (Phase Two) phases, we conducted a small pilot in which engineers signed up to try SwiftUI for their use cases. The pilots were used to collect feedback and improve our SwiftUI support at that phase before progressing to the next. 
This approach enabled us to deliver value at each stage of adoption, as opposed to adopting SwiftUI for whole features from the get-go with a large and uncertain infrastructure investment upfront.</p><h1 id="343d" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Enabling SwiftUI</h1><p id="bd93" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">We made a number of infrastructure and education investments to set engineers up for success.</p><h2 id="d23d" class="qf ou gr be ov qg qh dx oy qi qj dz pb np qk ql qm nt qn qo qp nx qq qr qs qt bj">Design System</h2><p id="f526" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">First-class SwiftUI support for <a class="af oc" href="https://airbnb.design/building-a-visual-language/" rel="noopener ugc nofollow" target="_blank">Airbnb’s design system</a> was a key priority for accelerating SwiftUI adoption company-wide. Instead of merely bridging our existing UIKit components, we rebuilt the design system in SwiftUI to make it far more flexible and powerful.</p><p id="f2aa" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Every view component in our design system supports styling to improve reusability via customization. We have a series of style protocols which, when combined with generated code, allow us to pass style objects down through the SwiftUI environment to mimic SwiftUI’s built-in styling paradigms. One such protocol defines what we call “flexible styles”. Here’s some example code:</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">public protocol FlexibleSwiftUIViewStyle: DynamicProperty {
  /// The content view type of this style, passed to `body()`.
  associatedtype Content
  /// The type of view representing the body.
  associatedtype Body: View
  /// Renders a view for this style.
  @ViewBuilder
  func body(content: Content) -&gt; Body
}</pre><p id="f5eb" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">This protocol allows us to create a style object with a collection of settable properties that can completely customize the rendering of a component. A content object is passed to the style so that it can access the view’s underlying state or interactions when creating a new view body. Here is an example style implementation for a numeric stepper (with some styling omitted for brevity):</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">public struct DefaultStepperStyle: DLSNumericStepperStyle {
  public var valueLabel = TextStyle…

  public func body(content: DLSNumericStepperStyleContent) -&gt; some View {
    HStack {
      Button(action: content.onDecrement) { subtractIcon }
        .disabled(content.atLowerBound)
      Text(content.description)
        .textStyle(valueLabel)
      Button(action: content.onIncrement) { addIcon }
        .disabled(content.atUpperBound)
    }
  }
}</pre><figure class="og oh oi oj ok ol od oe paragraph-image"><div class="od oe rd"><picture></picture></div><figcaption class="re fc rf od oe rg rh be b bf z dt"><em class="ri">Example stepper created from the default style properties</em></figcaption></figure><p id="ef72" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">However, with flexible styles engineers can add an entirely custom stepper style with just a few dozen lines of code by implementing a new type that conforms to DLSNumericStepperStyle. That style can be set on a view using an autogenerated view modifier:</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">DLSNumericStepper(value: $value, in: 0...)
.dlsNumericStepperStyle(CustomStepperStyle())</pre><figure class="og oh oi oj ok ol od oe paragraph-image"><div class="od oe rd"><picture></picture></div><figcaption class="re fc rf od oe rg rh be b bf z dt"><em class="ri">Example stepper created from custom style properties.</em></figcaption></figure><p id="4ac4" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Since optimized accessibility support is implemented in the DLSNumericStepper view, custom styles automatically get the appropriate accessibility behavior. We have used this flexible styling approach throughout the implementation of our design system, which allows product engineers to build new component variations quickly and without accessibility bugs.</p><h1 id="4b42" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Epoxy Bridging</h1><p id="548c" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj"><a class="af oc" href="https://airbnb.io/projects/epoxy-ios/" rel="noopener ugc nofollow" target="_blank">Epoxy</a> powers thousands of screens in the Airbnb app. To enable seamless adoption of SwiftUI, we built infrastructure to enable Epoxy not only to bridge SwiftUI views into UIKit-based Epoxy lists, but also to bridge Epoxy UIKit views to SwiftUI.</p><p id="fab0" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">To bridge SwiftUI views to a UIKit Epoxy list, we created an <em class="os">itemModel</em> view modifier that establishes the Epoxy identity for the SwiftUI View. In the implementation, this method wraps the view into a <a class="af oc" href="https://developer.apple.com/documentation/swiftui/uihostingcontroller" rel="noopener ugc nofollow" target="_blank">UIHostingController</a> and embeds it within a collection view cell. 
This utility unlocked the first phase of our SwiftUI rollout by making it trivial to adopt SwiftUI in our existing Epoxy screens.</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">SwiftUIRow(
  title: "Row \(id)",
  subtitle: "Subtitle")
  .itemModel(dataID: id)</pre><p id="eea6" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Similarly, one can bridge UIKit views to SwiftUI with a view extension that creates a SwiftUI view from a UIKit component using its content, style invariants, and any additional view configuration. In the implementation, this API uses a generic <a class="af oc" href="https://developer.apple.com/documentation/swiftui/uiviewrepresentable" rel="noopener ugc nofollow" target="_blank">UIViewRepresentable</a>, which automatically creates and updates the UIView as its content and style change.</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">EpoxyRow.swiftUIView(
  content: .init(title: "Row \(index)", subtitle: …),
  style: .small)
  .configure { context in
    print("Configuring \(context.view)")
  }
  .onTapGesture {
    print("Row \(index) tapped!")
  }</pre><p id="c735" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Given the vastly different layout system of SwiftUI, properly laying out a UIKit component was a challenge.
We developed <a class="af oc" href="https://github.com/airbnb/epoxy-ios/blob/9f9c93ced5fd6ab6e022f969c12cadc562ae79bf/Sources/EpoxyCore/SwiftUI/LayoutUtilities/SwiftUIMeasurementContainer.swift#LL281C20-L281C20" rel="noopener ugc nofollow" target="_blank">a configurable approach that</a> automatically supports complex views such as UILabel, which requires an additional layout pass to properly size.</p><h1 id="9d3d" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Unidirectional Data Flow</h1><p id="71f0" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">With Epoxy we found that leveraging a unidirectional data flow pattern made our UI predictable and easy to reason about. We built our screens so that the Epoxy content is rendered as a function of the screen’s state. User interactions are dispatched as actions that result in mutations to the state, which trigger a re-render of the screen. We use a StateStore object to house screen state and handle actions to mutate that state. To adapt this pattern to SwiftUI, we updated our StateStore to conform to <a class="af oc" href="https://developer.apple.com/documentation/combine/observableobject" rel="noopener ugc nofollow" target="_blank">ObservableObject</a> which allows the store to trigger a re-render of the screen’s SwiftUI View on state changes. We found that engineers preferred to continue to build screens in SwiftUI using this approach, since it enables the business and state mutation logic to be kept separate from the presentation logic. In many cases we were able to shift screen logic from Epoxy to SwiftUI screens with no changes. 
To illustrate the similarities, here is a simple counter screen implemented in both view systems:</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">// In Epoxy/UIKit:
struct CounterContentPresenter: StateStoreContentPresenter {
  let store: StateStore&lt;CounterState, CounterAction&gt;

  var content: UniListViewControllerContent {
    .currentDLSStandardStyle()
    .items {
      BasicRow.itemModel(
        dataID: ItemID.count,
        content: .init(titleText: "Count \(state.count)"),
        style: .standard)
        .didSelect { _ in
          store.handle(.increment)
        }
    }
  }
}</pre><pre class="rj qu qv qw bo qx ba bj">// In SwiftUI:
struct CounterScreen: View {
  @ObservedObject
  var store: StateStore&lt;CounterState, CounterAction&gt;

  var body: some View {
    DLSListScreen {
      DLSRow(title: "Count \(store.state.count)")
        .highlightEffectButton {
          store.handle(.increment)
        }
    }
  }
}</pre><h1 id="32e5" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Testability</h1><p id="d154" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">To ensure a high-quality product, we wanted our SwiftUI code to be testable by design.
<a class="af oc" href="https://bitrise.io/blog/post/snapshot-testing-in-ios-testing-the-ui-and-beyond" rel="noopener ugc nofollow" target="_blank">Snapshot testing</a> is our primary approach for testing views, so we use a static definition to provide named view variants both to our component browser and to our snapshot testing service:</p><pre class="og oh oi oj ok qu qv qw bo qx ba bj">enum DLSPrimaryButton_Definition: ViewDefinition, PreviewProvider {
  static var contentVariants: ContentVariants {
    DLSPrimaryButton(title: "Title") { … }
      .named("Short text")

    DLSPrimaryButton(title: "Title") { … }
      .disabled(true)
      .named("Disabled")
  }
}</pre><p id="fcaf" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Since we’re returning view variants here, there is a lot of flexibility in what you can test: the framework accepts any content variation or combination of view modifiers. Additionally, we conform these definitions to SwiftUI’s PreviewProvider protocol and convert these content variants into the expected return type so that engineers can rapidly iterate on the component using <a class="af oc" href="https://developer.apple.com/documentation/swiftui/previews-in-xcode" rel="noopener ugc nofollow" target="_blank">Xcode Previews</a>.</p><p id="fb53" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Unlike declarative UI frameworks on other platforms, SwiftUI does not provide a built-in testing library.
In order to support <a class="af oc" href="https://testing-library.com/docs/react-testing-library/example-intro" rel="noopener ugc nofollow" target="_blank">behavioral</a>-<a class="af oc" href="https://developer.android.com/jetpack/compose/testing" rel="noopener ugc nofollow" target="_blank">style</a> tests of components and screens, we integrated the open source <a class="af oc" href="https://github.com/nalexn/ViewInspector" rel="noopener ugc nofollow" target="_blank">ViewInspector library</a>, to which we’ve also contributed.</p><h1 id="3d1d" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Education</h1><p id="f98c" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">We heard from some of our peer companies that a significant challenge in adopting SwiftUI was building in-house expertise across a large iOS team. To address this proactively, we held multiple half-week SwiftUI workshops focused on SwiftUI fundamentals, which nearly half of our iOS engineering team attended. Attendees reported that their confidence in SwiftUI fundamentals increased by 37%, and their confidence in building new components increased by 39%. Additionally, nearly a year later, attendees rated their SwiftUI expertise 8% higher than engineers who did not attend a workshop.</p><h1 id="964f" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Findings on SwiftUI</h1><h2 id="b463" class="qf ou gr be ov qg qh dx oy qi qj dz pb np qk ql qm nt qn qo qp nx qq qr qs qt bj">Lines of Code</h2><p id="fb57" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">Given Airbnb’s multimillion-line iOS codebase, we were excited by the potential for SwiftUI to reduce the amount of code required to build UI.
In an early experiment in which we rewrote our review card, we saw a <em class="os">6x reduction in lines of code</em>, from 1,121 lines to a mere 174! Over the past two years we have seen reductions of similar magnitude as our SwiftUI adoption has progressed.</p><h2 id="1727" class="qf ou gr be ov qg qh dx oy qi qj dz pb np qk ql qm nt qn qo qp nx qq qr qs qt bj">Performance</h2><p id="561d" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">UI performance was a key concern as we evaluated SwiftUI. Fortunately, after running multiple experiments, we verified that the <a class="af oc" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">page performance score</a> when using SwiftUI was comparable to a UIKit implementation. We noticed a small overhead when instantiating UIHostingController, but were able to reduce this by adding <a class="af oc" href="https://github.com/airbnb/epoxy-ios/blob/master/Sources/EpoxyCore/SwiftUI/EpoxySwiftUIHostingView.swift#L12" rel="noopener ugc nofollow" target="_blank">a reuse pool of hosting controllers</a> to Epoxy.</p><h2 id="35ef" class="qf ou gr be ov qg qh dx oy qi qj dz pb np qk ql qm nt qn qo qp nx qq qr qs qt bj">Adoption &amp; Developer Satisfaction</h2><p id="c41b" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">With much excitement about SwiftUI within the company, organic adoption of the framework has been rapid. Our limited pilot of building components in SwiftUI began in January 2022, with general availability beginning that May.
Building entire screens in SwiftUI entered the pilot phase in October 2022 and then entered general availability in January 2023.</p><p id="f927" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">As of September, we have over 500 SwiftUI views and roughly 200 SwiftUI screens. Many of the screens for <a class="af oc" href="https://www.airbnb.com/release" rel="noopener ugc nofollow" target="_blank">Airbnb’s 2023 Summer Release</a> were fully powered by SwiftUI.</p><figure class="og oh oi oj ok ol od oe paragraph-image"><div role="button" tabindex="0" class="om on fg oo bg op"><div class="od oe rk"><picture></picture></div></div><figcaption class="re fc rf od oe rg rh be b bf z dt"><em class="ri">The growth of SwiftUI views and screens in Airbnb’s product.</em></figcaption></figure><p id="c263" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Airbnb’s iOS engineers are also highly satisfied with SwiftUI. In our most recent survey, 77% of survey respondents said that SwiftUI improved their efficiency. Many respondents mentioned that their efficiency would improve further with more SwiftUI experience, including those that rated it as slowing them down. 
100% of survey respondents said that SwiftUI did not negatively affect the quality of their features, and some cited SwiftUI as an improvement to their code quality.</p><h1 id="c7a1" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Challenges</h1><p id="3635" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">Though the move to SwiftUI has by and large been a major success, we have encountered the following challenges:</p><ul class=""><li id="a173" class="ng nh gr ni b hp nj nk nl hs nm nn no np pu nr ns nt pv nv nw nx pw nz oa ob px py pz bj">While Swift and its surrounding foundation have been open sourced, SwiftUI’s implementation remains a black box. If SwiftUI were open sourced, we could better understand the framework and debug more effectively.</li><li id="26d7" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">Our visibility into the evolution of SwiftUI is limited to yearly announcements. If we had a clearer understanding of where SwiftUI is headed, we could better prioritize our adoption focus and know where to invest in custom solutions.</li><li id="fa33" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">Airbnb supports the latest two iOS versions. 
If newer SwiftUI APIs were backported to older iOS versions, we could take advantage of powerful new features more quickly and spend less time writing fallback solutions.</li><li id="999d" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">In order to fully drop UIKit, we will need a set of SwiftUI APIs that support custom transitions and navigation patterns.</li><li id="8502" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">We’ve run into a number of challenges and limitations using LazyVStack and ScrollView, including:<ul class=""><li id="a025" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">Insertion, removal, and update animations are often broken.</li><li id="3bd6" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">Prefetching offscreen cells, images, or data is not possible.</li><li id="5bd5" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">Some <a class="af oc" href="https://openradar.appspot.com/FB9900814" rel="noopener ugc nofollow" target="_blank">states are reset</a> when views are scrolled offscreen.</li></ul></li><li id="fe59" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">The SwiftUI APIs for text input don’t support all of the features that their UIKit counterparts support, so engineers must bridge to UIKit.</li><li id="61b1" class="ng nh gr ni b hp qa nk nl hs qb nn no np qc nr ns nt qd nv nw nx qe nz oa ob px py pz bj">We have 18 open feedback reports with Apple documenting SwiftUI bugs or enhancement requests that we’ve come across.</li></ul><h1 id="0d38" class="ot ou gr be ov ow ox hr oy oz pa hu pb pc pd pe pf pg ph pi pj pk pl pm pn po bj">Conclusion</h1><p id="8f59" class="pw-post-body-paragraph ng nh gr ni b hp pp nk nl hs pq nn no np pr nr ns nt ps nv nw nx pt nz oa ob gk bj">In spite 
of these challenges, overall we have experienced smooth sailing in our careful adoption of SwiftUI at Airbnb. By rebuilding our design system, prioritizing education, and providing seamless integration with our existing frameworks, we have improved developer velocity and satisfaction while maintaining a high quality bar. We’re excited to watch SwiftUI continue to evolve and power more experiences in our app!</p></div></div></div><div class="ab ca rl rm rn ro" role="separator"><div class="gk gl gm gn go"><div class="ab ca"><div class="ch bg fw fx fy fz"><p id="fd1c" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj">Thanks to Eric Horacek, Matthew Cheok, Michael Bachand, Rafael Assis, Ortal Yahdav, Nick Fox and many others for all of their contributions to SwiftUI at Airbnb.</p></div></div></div><div class="ab ca rl rm rn ro" role="separator"><div class="gk gl gm gn go"><div class="ab ca"><div class="ch bg fw fx fy fz"><p id="f42b" class="pw-post-body-paragraph ng nh gr ni b hp nj nk nl hs nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob gk bj"><em class="os">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. 
Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div></div></div></div></div></section></div></div></article><article class="dv"><div class="dv sk l"><div class="bg dv"><div class="dv l"><div class="dv wf wg wh wi wj wk wl wm wn wo wp wq wr"><div class="ws"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Changing the Root View in SwiftUI" rel="noopener follow" href="https://medium.com/@jaykar.parmar/changing-the-root-view-in-swiftui-5aaaf1c1d66"><div class="wu wv ww wx wy"><img alt="Changing the Root View in SwiftUI" class="bg wz xa xb xc bw" src="https://miro.medium.com/v2/resize:fit:1358/1*yuVzMhCJyDENbyhwAsrkwA.png" /></div></a></div><div class="wt ab ca cn"><div class="xd xe xf xg xh ab"><div class="rw l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@jaykar.parmar"><div class="l fg"><img alt="Jaykar Parmar" class="l fa bx xi xj cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*IopcQq8sVHY8h1j8nfLqKw.png" width="20" height="20" /></div></a></div></div></div><div class="xk l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jf ab q" rel="noopener follow" href="https://medium.com/@jaykar.parmar"><p class="be b du z ju jv jw jx jy jz ka kb bj">Jaykar Parmar</p></a></div></div></div></div><div class="xl xm xn xo xp xq xr xs xt xu l gk"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@jaykar.parmar/changing-the-root-view-in-swiftui-5aaaf1c1d66"><div title=""><h2 class="be gs ow hr xv xw oy oz hu xx xy pb np ql 
xz ya qm nt qo yb yc qp nx qr yd ye qs ju jw jx jz kb bj">Changing the Root View in SwiftUI</h2></div><div class="yf l"><h3 class="be b jd z ju yg jw jx yh jz kb dt">Introduction: In SwiftUI, the ability to change the root view of your app dynamically can be a powerful feature, allowing you to tailor the…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@jaykar.parmar/changing-the-root-view-in-swiftui-5aaaf1c1d66"><div class="ab q">3 min read·Sep 2</div></a><div class="yi yj yk yl ym l"><div class="ab co"><div class="am yn yo yp yq yr ys yt yu yv yw ab q"><div class="ab q lk"><div class="pw-multi-vote-icon fg jt ll lm ln"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2F5aaaf1c1d66&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2F%40jaykar.parmar%2Fchanging-the-root-view-in-swiftui-5aaaf1c1d66&amp;user=Jaykar+Parmar&amp;userId=451d6735582d"><div class="lo ao lp lq lr ls am lt lu lv ln"></div></a></div><div class="pw-multi-vote-count l lw lx ly lz ma mb mc"><p class="be b du z dt">--</p></div></div><div class="yx l"><div><div class="bl" aria-hidden="false"><a class="af fh ah lo aj ak al mf an ao ap aq ar as at me ab q mg mh" aria-label="responses" rel="noopener follow" href="https://medium.com/@jaykar.parmar/changing-the-root-view-in-swiftui-5aaaf1c1d66?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">1</p></a></div></div></div></div><div class="ab q yz za"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/unlocking-swiftui-at-airbnb-ea58f50cde49</link>
      <guid>https://medium.com/airbnb-engineering/unlocking-swiftui-at-airbnb-ea58f50cde49</guid>
      <pubDate>Thu, 21 Sep 2023 19:02:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Riverbed: Optimizing Data Access at Airbnb’s Scale]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gj gk gl gm gn"><div class="ab ca"><div class="ch bg fv fw fx fy"><div class=""><div class="hr hs ht hu hv"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hw hx hy hz ia ab"><div><div class="ab ib"><a rel="noopener follow" href="https://medium.com/@amreshakim"><div><div class="bl" aria-hidden="false"><div class="l ic id bx ie if"><div class="l ff"><img alt="Amre Shakim" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/1*nYGIvAZItMBVG5NkP_ALAg@2x.jpeg" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering" rel="noopener follow"><div class="ij ab ff"><div><div class="bl" aria-hidden="false"><div class="l ik il bx ie im"><div class="l ff"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" rel="noopener follow" href="https://medium.com/@amreshakim">Amre Shakim</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2F746b073212a9&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Friverbed-optimizing-data-access-at-airbnbs-scale-c37ecf6456d9&amp;user=Amre+Shakim&amp;userId=746b073212a9">Follow</a></p></div></div></div></div><div class="l iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al 
am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">7 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fc fd q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fc37ecf6456d9&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Friverbed-optimizing-data-access-at-airbnbs-scale-c37ecf6456d9&amp;user=Amre+Shakim&amp;userId=746b073212a9"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa sq cm"><div class="l ae"><div class="ab ca"><div class="sr ss st su sv no ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div><div class="ab ca mt mu mv mw" role="separator"><div class="gj gk gl gm gn"><div class="ab ca"><div class="ch bg fv fw fx fy"><figure class="ne nf ng nh ni nj nb nc paragraph-image"><div role="button" tabindex="0" class="nk nl ff nm bg nn"><div class="nb nc 
nd"><picture></picture></div></div></figure><p id="6524" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">An overview of Airbnb’s Data Framework for faster and more reliable read-heavy workloads.</p><blockquote class="oo op oq"><p id="f707" class="nq nr or ns b nt nu nv nw nx ny nz oa os oc od oe ot og oh oi ou ok ol om on gj bj"><strong class="ns gr">By:</strong> Sivakumar Bhavanari, Krish Chainani, Victor Chen, Yanxi Chen, Xiangmin Liang, Anton Panasenko, Sonia Stan, Peggy Zheng and Amre Shakim</p></blockquote><h1 id="92e3" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Overview</h1><p id="388a" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">The evolution of Airbnb and its tech stack calls for a scalable and reliable foundation that simplifies the access and processing of complex data sets. Enter Riverbed, a data framework designed for fast read performance and high availability. 
In this blog series, we will introduce Riverbed, highlighting its objectives, design, and features.</p><h1 id="710f" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Why Riverbed Was Created</h1><p id="9937" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">The growth of Airbnb has increased the number of databases we operate, the variety of data types they serve, and the number of data-intensive services accessing these databases, resulting in a complex data infrastructure and a <a class="af py" rel="noopener" href="https://medium.com/airbnb-engineering/building-services-at-airbnb-part-1-c4c1d8fa811b">Service-Oriented Architecture</a> (SOA) that is difficult to manage.</p><figure class="ne nf ng nh ni nj nb nc paragraph-image"><div role="button" tabindex="0" class="nk nl ff nm bg nn"><div class="nb nc pz"><picture></picture></div></div></figure><p id="3159" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj"><strong class="ns gr">Figure 1. Airbnb SOA dependency graph</strong></p><p id="c61c" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">We have noticed a specific pattern of queries that access multiple data sources, require complicated hydration business logic, and involve complex data transformations that are difficult to optimize. Airbnb workloads rely heavily on these queries on the read path, which exacerbates performance issues.</p><p id="ce87" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">Let’s examine how Airbnb’s payment system faced challenges after transitioning from a monolith to SOA. 
The payment system at Airbnb is complex: it accesses multiple data sources and requires intricate business logic to compute fees, transaction dates, currencies, amounts, and total earnings. After the SOA migration, however, the data needed for these calculations became scattered across various services and tables. This made it challenging to provide all the necessary information in a simple and performant manner, particularly for read-heavy requests. To learn more about these and other challenges, we recommend reading <a class="af py" rel="noopener" href="https://medium.com/airbnb-engineering/unified-payments-data-read-at-airbnb-e613e7af1a39">this</a> blog post.</p><p id="810d" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">One possible solution is to register the most frequent queries, pre-compute the denormalized payment data, and store the computed results in a table optimized for read-heavy requests. This is known as a materialized view, and many databases provide it as built-in functionality.</p><p id="a7b8" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">In an SOA environment where data is distributed across multiple databases, the views we create depend on data from various sources. Materializing views across sources in this way is widely adopted in industry and is usually implemented using a combination of Change-Data-Capture (CDC), stream processing, and a database to persist the final results.</p><p id="9483" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">Lambda and Kappa are two real-time data processing architectures. Lambda combines batch and real-time processing for efficient handling of large data volumes, while Kappa focuses solely on stream processing. 
Kappa’s simplicity offers better maintainability, but it poses challenges for implementing backfill mechanisms and ensuring data consistency, especially with out-of-order events.</p><p id="e438" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">To address these challenges and simplify the construction and management of distributed materialized views, we developed Riverbed. Riverbed is a Lambda-like data framework that abstracts the complexities of maintaining materialized views, enabling faster product iterations. In the following sections, we will discuss Riverbed’s design choices and the tradeoffs made to achieve high performance, reliability, and consistency goals.</p><h1 id="28b1" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Riverbed Design</h1><h1 id="509c" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Overview</h1><p id="fe0c" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">At a high level, Riverbed adopts a Lambda architecture that consists of an online component for processing real-time change events and an offline component for filling in missing data. Riverbed provides a declarative interface for product engineers to define the queries and implement the business logic for computation, using GraphQL, for both the online and offline components. Under the hood, the framework efficiently executes the queries, computes the derived data, and finally writes to one or more designated sinks. 
Riverbed handles the heavy lifting for common challenges of data-intensive systems, such as concurrent writes, versioning, integrations with various infrastructure components at Airbnb, and data correctness guarantees, which ultimately enables product teams to iterate quickly on product features.</p><h1 id="c8f4" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Streaming system</h1><figure class="ne nf ng nh ni nj nb nc paragraph-image"><div role="button" tabindex="0" class="nk nl ff nm bg nn"><div class="nb nc qa"><picture></picture></div></div></figure><p id="5cc9" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj"><strong class="ns gr">Figure 2. Streaming system</strong></p><p id="56f3" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">The streaming system’s primary function is to address the incremental view materialization problem that arises when changes are made to system-of-record tables. To achieve this, the system consumes Change-Data-Capture (CDC) events via a <a class="af py" rel="noopener" href="https://medium.com/airbnb-engineering/capturing-data-evolution-in-a-service-oriented-architecture-72f7c643ee6f">Kafka-based system</a>. It converts these events into “notification” triggers, which are associated with specific document IDs in the sink. A “notification” trigger serves as a signal to refresh a particular document. This process occurs in a highly parallel manner with out-of-order, batched consumers. Within each batch, notification triggers are deduplicated before being written to Kafka.</p><p id="0071" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">A second process consumes the previously produced “notification” triggers. 
Through a series of joins, data stitching, and user-specified operations, each “notification” is transformed into a document. The resulting document is then written to the designated sink. Whenever a change occurs on a system-of-record table, the system replaces the affected document with a more up-to-date version, ensuring eventual consistency.</p><h1 id="6a79" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Batch system</h1><p id="6b09" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">Events can still occasionally be lost somewhere in the pipeline or because of bugs, such as in CDC. To address these potential inconsistencies, we implemented a batch system that reconciles events missed by the online streaming path. This process identifies only the data that has changed with respect to the materialized view document, and it also provides a mechanism for bootstrapping the materialized view through a backfill. However, reading and processing large volumes of data from online sources may pose performance bottlenecks and potential heterogeneity issues, making direct backfills or reconciliation from these sources infeasible.</p><p id="3dbe" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">To overcome these challenges, Riverbed leverages Apache Spark within its backfill and reconciliation pipelines, taking advantage of the daily snapshots stored in the offline data warehouse. The framework generates Spark SQL based on GraphQL queries created by clients. 
Using the data from the warehouse, Riverbed reuses the same business logic from the streaming system to transform the data and write to sinks.</p><figure class="ne nf ng nh ni nj nb nc paragraph-image"><div role="button" tabindex="0" class="nk nl ff nm bg nn"><div class="nb nc qa"><picture></picture></div></div></figure><p id="843d" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj"><strong class="ns gr">Figure 3. Batch system</strong></p><h1 id="5551" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Concurrency/versioning</h1><p id="139f" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">In any distributed system, concurrent updates can cause race conditions that result in incorrect or inconsistent data. Riverbed avoids race conditions by serializing all changes for a given document using Kafka. Incoming source mutations are first converted to intermediate events containing only the sink document ID and are written to Kafka. A secondary (notification) process then consumes these intermediate events, materializes the corresponding documents, and writes them to the sink. Because the intermediate Kafka topic is partitioned by the document ID of the event, all events with the same document ID are processed serially by the same consumer, which avoids race conditions from parallel real-time streaming writes altogether.</p><p id="fb2a" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">To handle parallel writes between real-time streaming and offline jobs, we store a version based on timestamps in the sink. 
Each sink type is required to allow a write only if its version is greater than or equal to the version currently stored, which resolves race conditions between the streaming and batch systems.</p><p id="6c57" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">Conceptually, Riverbed views each mutation as a hint of a change. The processor always uses data from the source of truth, and hence will produce sink documents in the latest consistent state as of the time of processing. As a result, processing of events is idempotent and can be done any number of times and in any order.</p><h1 id="7434" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Results</h1><p id="acac" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">Riverbed has had a broad impact across Airbnb. It currently processes 2.4B events and writes 350M documents daily, and powers 50+ materialized views across Airbnb. Riverbed helps power features such as payments, search within messages, review rendering on the listing page, and many other features around co-hosting, itineraries, and internal-facing products.</p><h1 id="86f3" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Summary and Next Steps</h1><p id="ceec" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">In conclusion, Riverbed provides a scalable and high-performance data framework that improves the efficiency of read-heavy workloads. Riverbed’s design choices provide a declarative interface for product engineers, efficient execution of queries, and data correctness guarantees. This simplifies the construction and management of distributed materialized views and enables product teams to quickly iterate on features. 
Using Riverbed for pre-computing views of data has already resulted in significant latency improvements and improved reliability of the flow, ensuring a faster and more reliable experience for Airbnb’s Host and Guest communities.</p><p id="8db9" class="pw-post-body-paragraph nq nr gq ns b nt nu nv nw nx ny nz oa ob oc od oe of og oh oi oj ok ol om on gj bj">In future posts, we will explore different aspects of Riverbed in greater detail, including its design considerations, performance optimizations, and future development directions.</p><h1 id="f091" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">Acknowledgments</h1><p id="bd8a" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj">All of this has been a significant collective effort from the team and any discussion of Read-Optimized Stores would not be complete without acknowledging the invaluable contributions of everyone on the team, both past and present. Big thanks to Will Moss, Krish Chainani, Victor Chen, Sonia Stan, Xiangmin Liang, Siva Bhavanari, Peggy Zheng, Yanxi Chen on the development team; support from Juan Tamayo, Zoran Dimitrijevic, Zheng Liu, Chandramouli Rangarajan and leadership from Amre Shakim, Jessica Tai, Parth Shah, Adam Kocoloski, Abhishek Parmar, Bill Farner and Usman Abbasi. Last but not least, we would like to extend our sincere gratitude to Shylaja Ramachandra, Lauren Mackevich and Tina Nguyen for their invaluable assistance in editing and publishing this post. Their contributions have greatly improved the quality and clarity of the content.</p><h1 id="20eb" class="ov ow gq be ox oy oz pa pb pc pd pe pf pg ph pi pj pk pl pm pn po pp pq pr ps bj">****************</h1><p id="0261" class="pw-post-body-paragraph nq nr gq ns b nt pt nv nw nx pu nz oa ob pv od oe of pw oh oi oj px ol om on gj bj"><em class="or">All product names, logos, and brands are property of their respective owners. 
All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div></div></div></div></section></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Chronon — A Declarative Feature Engineering Framework" rel="noopener follow" href="https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04"><div class="vd ve vf vg vh"><img alt="Chronon — A Declarative Feature Engineering Framework" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/0*zVx5nqX7ADS6dRuO" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@nikhilsimha"><div class="l ff"><img alt="Nikhil Simha" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*OptNhEWMXznheH1glDcz6A.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@nikhilsimha"><p class="be b du z jh ji jj jk jl jm jn jo bj">Nikhil Simha</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as 
at" rel="noopener follow" href="https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Chronon — A Declarative Feature Engineering Framework</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">A framework for developing production grade features for machine learning models. The purpose of the blog is to provide an overview of…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04"><div class="ab q">8 min read·Jul 11</div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fb7b8ce796e04&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fchronon-a-declarative-feature-engineering-framework-b7b8ce796e04&amp;user=Nikhil+Simha&amp;userId=e32eea7118ed"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">1</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i 
d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Metis: Building Airbnb’s Next Generation Data Management Platform" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19"><div class="vd ve vf vg vh"><img alt="Metis: Building Airbnb’s Next Generation Data Management Platform" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/1*HC0FvtGMOcsE3d138wRDRw.jpeg" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@zhengxb2005"><div class="l ff"><img alt="Xiaobin Zheng" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/0*XRNumcimpnyRzc4W" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@zhengxb2005"><p class="be b du z jh ji jj jk jl jm jn jo bj">Xiaobin Zheng</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19"><div title=""><h2 
class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Metis: Building Airbnb’s Next Generation Data Management Platform</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">How Airbnb evolved our data catalog into a platform for managing and governing our data warehouse at scale.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19"><div class="ab q">8 min read·Jun 8</div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fd2c5219edf19&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmetis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19&amp;user=Xiaobin+Zheng&amp;userId=7fb3360a60d3"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">2</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv 
uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Improving Performance with HTTP Streaming" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408"><div class="vd ve vf vg vh"><img alt="Improving Performance with HTTP Streaming" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/1*q2A2ZjnULygCKIWuiSBKXg.jpeg" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@hey.victor"><div class="l ff"><img alt="Victor" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*02446_XHOIpleO6C7SvQ4A.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@hey.victor"><p class="be b du z jh ji jj jk jl jm jn jo bj">Victor</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Improving Performance with HTTP Streaming</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">How HTTP Streaming 
can improve page performance and how Airbnb enabled it on an existing codebase</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408"><div class="ab q">7 min read·May 17</div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fba9e72c66408&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fimproving-performance-with-http-streaming-ba9e72c66408&amp;user=Victor&amp;userId=e46fded15590"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">16</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Sunsetting React Native" rel="noopener follow" href="https://medium.com/airbnb-engineering/sunsetting-react-native-1868ba28e30a"><div class="vd ve vf vg vh"><img alt="Sunsetting React Native" class="bg vi vj vk vl bw" 
src="https://miro.medium.com/v2/resize:fit:1358/1*8c-9hgBkRGcllO9CHcTzbQ.jpeg" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@gpeal"><div class="l ff"><img alt="Gabriel Peal" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*-gkqq0poEnrA4DuUxEawCQ.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@gpeal"><p class="be b du z jh ji jj jk jl jm jn jo bj">Gabriel Peal</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/sunsetting-react-native-1868ba28e30a"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Sunsetting React Native</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">Due to a variety of technical and organizational issues, we will be sunsetting React Native and putting all of our efforts into making…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/sunsetting-react-native-1868ba28e30a"><div class="ab q">6 min read·Jun 19, 2018</div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi 
xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F1868ba28e30a&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fsunsetting-react-native-1868ba28e30a&amp;user=Gabriel+Peal&amp;userId=bfa26a83c4b6"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/airbnb-engineering/sunsetting-react-native-1868ba28e30a?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">52</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="A batch of requests from client to server limited by the rate limiter" rel="noopener follow" href="https://medium.com/design-bootcamp/the-world-of-rate-limit-algorithms-54fb9078e90a"><div class="vd ve vf vg vh"><img alt="A batch of requests from client to server limited by the rate limiter" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/1*3bpJGXivbtV4MenqCqYOIw.png" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@ethi"><div class="l ff"><img alt="Ethiraj 
Srinivasan" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/0*vYcYxhf18omazKTl.jpg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@ethi"><p class="be b du z jh ji jj jk jl jm jn jo bj">Ethiraj Srinivasan</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/design-bootcamp" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Bootcamp</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/design-bootcamp/the-world-of-rate-limit-algorithms-54fb9078e90a"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">The world of Rate Limit Algorithms</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">Rate limit algorithm is a mechanism used to control the rate of requests or messages being sent or received by a system or service.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/design-bootcamp/the-world-of-rate-limit-algorithms-54fb9078e90a"><div class="ab q"><div class="qs ab"><div class="bl" aria-hidden="false"></div>·14 min read·Mar 9</div></div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fdesign-bootcamp%2F54fb9078e90a&amp;operation=register&amp;redirect=https%3A%2F%2Fbootcamp.uxdesign.cc%2Fthe-world-of-rate-limit-algorithms-54fb9078e90a&amp;user=Ethiraj+Srinivasan&amp;userId=31fe7ab6cdb3"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Microservices: The Million-Dollar Mistake Your Company is Making" rel="noopener follow" href="https://medium.com/gitconnected/microservices-the-million-dollar-mistake-your-company-is-making-c50eb428f732"><div class="vd ve vf vg vh"><img alt="Microservices: The Million-Dollar Mistake Your Company is Making" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/0*gee2b5NEwlWViv8j" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@rico-fritzsche"><div class="l ff"><img alt="Rico Fritzsche" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*Y69NVWERXzc5GIVFDCMvgQ.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@rico-fritzsche"><p class="be b du z jh ji jj jk jl jm jn jo bj">Rico 
Fritzsche</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/gitconnected" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Level Up Coding</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/gitconnected/microservices-the-million-dollar-mistake-your-company-is-making-c50eb428f732"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Microservices: The Million-Dollar Mistake Your Company is Making</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">In the middle of the last decade, as I was elbow-deep in code, tinkering with concepts like service discovery, a buzzword began to echo…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/gitconnected/microservices-the-million-dollar-mistake-your-company-is-making-c50eb428f732"><div class="ab q"><div class="qs ab"><div class="bl" aria-hidden="false"></div>·5 min read·Jul 11</div></div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fgitconnected%2Fc50eb428f732&amp;operation=register&amp;redirect=https%3A%2F%2Flevelup.gitconnected.com%2Fmicroservices-the-million-dollar-mistake-your-company-is-making-c50eb428f732&amp;user=Rico+Fritzsche&amp;userId=f97b9ef20eda"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln 
lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/gitconnected/microservices-the-million-dollar-mistake-your-company-is-making-c50eb428f732?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">20</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="System Design Blueprint: The Ultimate Guide" rel="noopener follow" href="https://medium.com/bytebytego-system-design-alliance/system-design-blueprint-the-ultimate-guide-e27b914bf8f1"><div class="vd ve vf vg vh"><img alt="System Design Blueprint: The Ultimate Guide" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/1*QSFihi7zXbR5X915MDmKyQ.png" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@zonito"><div class="l ff"><img alt="Love Sharma" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*gv7c1x--XZDXDOIuS2re6g.png" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@zonito"><p class="be b du z jh ji jj jk jl jm jn jo bj">Love Sharma</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj 
ak al am an ao ap aq ar is ab q" href="https://medium.com/bytebytego-system-design-alliance" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">ByteByteGo System Design Alliance</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/bytebytego-system-design-alliance/system-design-blueprint-the-ultimate-guide-e27b914bf8f1"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">System Design Blueprint: The Ultimate Guide</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">Developing a robust, scalable, and efficient system can be daunting. However, understanding the key concepts and components can make the…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/bytebytego-system-design-alliance/system-design-blueprint-the-ultimate-guide-e27b914bf8f1"><div class="ab q"><div class="qs ab"><div class="bl" aria-hidden="false"></div>·9 min read·Apr 20</div></div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fbytebytego-system-design-alliance%2Fe27b914bf8f1&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fbytebytego-system-design-alliance%2Fsystem-design-blueprint-the-ultimate-guide-e27b914bf8f1&amp;user=Love+Sharma&amp;userId=297e16e76b8"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj 
ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/bytebytego-system-design-alliance/system-design-blueprint-the-ultimate-guide-e27b914bf8f1?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">53</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="ElasticSearch V.S. MySQL" rel="noopener follow" href="https://medium.com/thedevproject/elasticsearch-v-s-mysql-b48c7daf2afa"><div class="vd ve vf vg vh"><img alt="ElasticSearch V.S. MySQL" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/0*vZpV1s9hhQ5N7Jkf" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@jinlow"><div class="l ff"><img alt="JIN" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*p1kSL0td0AI_d1um7RSJog.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@jinlow"><p class="be b du z jh ji jj jk jl jm jn jo bj">JIN</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/thedevproject" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Dev Project</p></a></div></div></div></div><div class="vu vv 
vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/thedevproject/elasticsearch-v-s-mysql-b48c7daf2afa"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">ElasticSearch V.S. MySQL</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">The Comparison</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/thedevproject/elasticsearch-v-s-mysql-b48c7daf2afa"><div class="ab q"><div class="qs ab"><div class="bl" aria-hidden="false"></div>·6 min read·Feb 9</div></div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fthedevproject%2Fb48c7daf2afa&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fthedevproject%2Felasticsearch-v-s-mysql-b48c7daf2afa&amp;user=JIN&amp;userId=950e50ef0ea5"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Microsoft Just Showed us the Future of ChatGPT with LongNet" rel="noopener follow" 
href="https://medium.com/@ignacio.de.gregorio.noblejas/microsoft-just-showed-us-the-future-of-chatgpt-with-longnet-124e106cbd6c"><div class="vd ve vf vg vh"><img alt="Microsoft Just Showed us the Future of ChatGPT with LongNet" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/1*JgyHs9hhwKOBKpNW2O8ExA.png" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@ignacio.de.gregorio.noblejas"><div class="l ff"><img alt="Ignacio de Gregorio" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*p6kCCpNZARkVEYv4OCH7GQ@2x.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@ignacio.de.gregorio.noblejas"><p class="be b du z jh ji jj jk jl jm jn jo bj">Ignacio de Gregorio</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@ignacio.de.gregorio.noblejas/microsoft-just-showed-us-the-future-of-chatgpt-with-longnet-124e106cbd6c"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Microsoft Just Showed us the Future of ChatGPT with LongNet</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">Let’s talk about Billions</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@ignacio.de.gregorio.noblejas/microsoft-just-showed-us-the-future-of-chatgpt-with-longnet-124e106cbd6c"><div class="ab q"><div class="qs ab"><div class="bl" aria-hidden="false"></div>·8 min read·5 days ago</div></div></a><div 
class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2F124e106cbd6c&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2F%40ignacio.de.gregorio.noblejas%2Fmicrosoft-just-showed-us-the-future-of-chatgpt-with-longnet-124e106cbd6c&amp;user=Ignacio+de+Gregorio&amp;userId=9b351e8113e9"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/@ignacio.de.gregorio.noblejas/microsoft-just-showed-us-the-future-of-chatgpt-with-longnet-124e106cbd6c?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">26</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qs l"><div class="bg dv"><div class="dv l"><div class="dv uo up uq ur us ut uu uv uw ux uy uz va"><div class="vb"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Small Teams &gt; Big Teams" rel="noopener follow" href="https://medium.com/startup-stash/small-teams-big-teams-cfe3a41daa6e"><div class="vd ve vf vg vh"><img alt="Small Teams &gt; Big Teams" class="bg vi vj vk vl bw" src="https://miro.medium.com/v2/resize:fit:1358/1*_BAs3vafZxZ6-hPhJTcxzA.png" /></div></a></div><div class="vc ab ca cn"><div class="vm vn vo vp vq ab"><div class="qe l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" 
rel="noopener follow" href="https://medium.com/@tedbauer"><div class="l ff"><img alt="Ted Bauer" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*akJRYabZUZuBh0ks79dkNg.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@tedbauer"><p class="be b du z jh ji jj jk jl jm jn jo bj">Ted Bauer</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/startup-stash" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Startup Stash</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/startup-stash/small-teams-big-teams-cfe3a41daa6e"><div title=""><h2 class="be gr oy pa we wf pb pc pe wg wh pf ob wi wj wk wl of wm wn wo wp oj wq wr ws wt jh jj jk jm jo bj">Small Teams &gt; Big Teams</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">Ah, back to the “Pizza Rule?”</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/startup-stash/small-teams-big-teams-cfe3a41daa6e"><div class="ab q"><div class="qs ab"><div class="bl" aria-hidden="false"></div>·3 min read·6 days ago</div></div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fstartup-stash%2Fcfe3a41daa6e&amp;operation=register&amp;redirect=https%3A%2F%2Fblog.startupstash.com%2Fsmall-teams-big-teams-cfe3a41daa6e&amp;user=Ted+Bauer&amp;userId=794a1c55cb75"><div class="lb ao lc ld le lf am lg lh li la"></div></a></div><div class="pw-multi-vote-count l lj lk ll lm ln lo lp"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al ls an ao ap aq ar as at lr ab q lt lu" aria-label="responses" rel="noopener follow" href="https://medium.com/startup-stash/small-teams-big-teams-cfe3a41daa6e?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">14</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/riverbed-optimizing-data-access-at-airbnbs-scale-c37ecf6456d9</link>
      <guid>https://medium.com/airbnb-engineering/riverbed-optimizing-data-access-at-airbnbs-scale-c37ecf6456d9</guid>
      <pubDate>Tue, 25 Jul 2023 20:46:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Chronon — A Declarative Feature Engineering Framework]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gj gk gl gm gn"><div class="ab ca"><div class="ch bg fv fw fx fy"><div class=""><div class="hr hs ht hu hv"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hw hx hy hz ia ab"><div><div class="ab ib"><a rel="noopener follow" href="https://medium.com/@nikhilsimha"><div><div class="bl" aria-hidden="false"><div class="l ic id bx ie if"><div class="l ff"><img alt="Nikhil Simha" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/1*OptNhEWMXznheH1glDcz6A.jpeg" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering" rel="noopener follow"><div class="ij ab ff"><div><div class="bl" aria-hidden="false"><div class="l ik il bx ie im"><div class="l ff"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" rel="noopener follow" href="https://medium.com/@nikhilsimha">Nikhil Simha</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2Fe32eea7118ed&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fchronon-a-declarative-feature-engineering-framework-b7b8ce796e04&amp;user=Nikhil+Simha&amp;userId=e32eea7118ed">Follow</a></p></div></div></div></div><div class="l iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj 
ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">8 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fc fd q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fb7b8ce796e04&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fchronon-a-declarative-feature-engineering-framework-b7b8ce796e04&amp;user=Nikhil+Simha&amp;userId=e32eea7118ed"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa sp cm"><div class="l ae"><div class="ab ca"><div class="sq sr ss st su ne ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><figure class="mu mv mw mx my mz mr ms paragraph-image"><div role="button" tabindex="0" class="na nb ff nc bg nd"><div class="mr ms mt"><picture></picture></div></div></figure><p id="00f3" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj 
bj"><strong class="ni gr">A framework for developing production-grade features for machine learning models. This post provides an overview of the core concepts in Chronon.</strong></p><p id="4977" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><a class="af oe" href="https://www.linkedin.com/in/nikhilsimha" rel="noopener ugc nofollow" target="_blank">Nikhil Simha Raprolu</a></p><h1 id="c99c" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Background</h1><p id="8a32" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">Airbnb uses machine learning in almost every product, from ranking search results to intelligently pricing listings and routing users to the right customer support agents.</p><p id="03c7" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">We noticed that feature management was a consistent pain point for the ML engineers working on these projects. Rather than focusing on their models, they were spending much of their time gluing together other pieces of infrastructure to manage their feature data, and they were still running into issues.</p><p id="87ff" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">One common issue arose from the log-and-wait approach to generating training data, where a user logs feature values from their serving endpoint, then waits to accumulate enough data to train a model. This wait period can be more than a year for models that need to capture seasonality. 
This was a major pain point for machine learning practitioners, hindering them from responding quickly to changing user behaviors and product demands.</p><p id="2119" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">A common approach to address this wait time is to transform raw data in the warehouse into training data using ETL jobs. However, users encountered a critical problem when they tried to launch their model to production — they needed to write complex streaming jobs or replicate ETL logic to serve their feature data, and often could not guarantee that the feature distribution for serving model inference was consistent with what they trained on. This training-serving skew led to hard-to-debug model degradation, and worse than expected model performance.</p><p id="d0dc" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">Chronon was built to address these pain points. It allows ML practitioners to define features and centralize the data computation for both model training and production inference, while guaranteeing consistency between the two.</p><h1 id="612d" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Introducing Chronon</h1><p id="77ef" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">This post is focused on the Chronon API and capabilities. 
At a high level, these include:</p><ul class=""><li id="9e15" class="ng nh gq ni b nj nk nl nm nn no np nq pi ns nt nu pj nw nx ny pk oa ob oc od pl pm pn bj"><strong class="ni gr">Ingesting data from a variety of sources</strong> — Event streams, fact/dim tables in warehouse, table snapshots, Slowly Changing Dimension tables, Change Data Streams, etc.</li><li id="8fa0" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od pl pm pn bj"><strong class="ni gr">Transforming that data</strong> — It supports standard SQL-like transformations as well as more powerful time-based aggregations.</li><li id="5e53" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od pl pm pn bj"><strong class="ni gr">Producing results both online and offline</strong> — <em class="pt">Online</em>, as low-latency end-points for feature serving, or <em class="pt">Offline</em> as Hive tables, for generating training data.</li><li id="bf86" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od pl pm pn bj"><strong class="ni gr">Flexible choice for updating results</strong> — You can choose whether the feature values are updated in real-time or at fixed intervals with an “Accuracy” parameter. 
This also ensures the same behavior even while backfilling.</li><li id="b665" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od pl pm pn bj"><strong class="ni gr">Using a powerful Python API </strong>that treats time-based aggregation and windowing as first-class concepts, along with familiar SQL primitives like Group-By, Join, and Select, while retaining the full flexibility and composability offered by Python.</li></ul><figure class="pu pv pw px py mz mr ms paragraph-image"><div role="button" tabindex="0" class="na nb ff nc bg nd"><div class="mr ms mt"><picture></picture></div></div></figure><h1 id="f42d" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">API Overview</h1><p id="57ad" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">First, let’s start with an example. The code snippet computes the <em class="pt">number of times an item is viewed by a user in the last five hours from an activity stream</em>, while applying some additional transformations and filters. This uses concepts like GroupBy, Aggregation, and EventSource.</p><p id="57c7" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">In the sections below we will demystify these concepts.</p><figure class="pu pv pw px py mz mr ms paragraph-image"><div role="button" tabindex="0" class="na nb ff nc bg nd"><div class="mr ms mt"><picture></picture></div></div></figure><h1 id="d6fe" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Understanding accuracy</h1><p id="e73f" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">Some use cases require derived data to be as up-to-date as possible, while others allow for updating at a daily cadence. 
For example, understanding the intent of a user’s search session requires accounting for the latest user activity. By contrast, to display revenue figures on a dashboard for human consumption, it is usually adequate to refresh the results at fixed intervals.</p><p id="f4f1" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">Chronon allows users to express whether a derivation needs to be updated in near real time or at daily intervals by setting the <em class="pt">‘Accuracy’</em> of a computation, which can be either <em class="pt">‘Temporal’</em> or <em class="pt">‘Snapshot’</em>. In Chronon, this accuracy applies both to online serving of data via low-latency endpoints and to offline backfilling via batch computation jobs.</p><h1 id="4e5e" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Understanding data sources</h1><p id="5771" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">Real-world data is ingested into the data warehouse continuously. There are three kinds of ingestion patterns. In Chronon, these ingestion patterns are specified by declaring the “type” of a data source.</p><h1 id="7823" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Event data sources</h1><p id="5e1d" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">Timestamped activity like views, clicks, sensor readings, and stock prices, published into a data stream like Kafka.</p><p id="0463" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">In the data lake these events are stored in date-partitioned tables (Hive). 
Assuming timestamps are millisecond-precise and data ingestion is partitioned by date, the ‘2023-07-04’ date partition of click events contains the clicks that happened between ‘2023-07-04 00:00:00.000’ and ‘2023-07-04 23:59:59.999’. You can configure the date partition column to match your warehouse convention, once globally, as a Spark parameter.</p><blockquote class="pz qa qb"><p id="b72b" class="ng nh pt ni b nj nk nl nm nn no np nq pi ns nt nu pj nw nx ny pk oa ob oc od gj bj">--conf "spark.chronon.partition.column=date_key"</p></blockquote><p id="d4fc" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">In Chronon you can declare an EventSource by specifying two things: a <em class="pt">‘table’</em> (Hive) and, optionally, a <em class="pt">‘topic’</em> (Kafka). Chronon can use the <em class="pt">‘table’</em> to backfill data with Temporal accuracy. When a <em class="pt">‘topic’</em> is provided, Chronon can also update a key-value store in real time to serve fresh data to applications and ML models.</p><h1 id="8f1b" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Entity data sources</h1><p id="9414" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">Attribute metadata related to business entities. A few examples for a retail business would be user information, with attributes like address and country, or item information, with attributes like price and available count. This data is usually served online to applications via OLTP databases like MySQL. These tables are snapshotted into the warehouse, usually at daily intervals. 
So a ‘2023-07-04’ partition contains a snapshot of the item information table taken at ‘2023-07-04 23:59:59.999’.</p><p id="45a4" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">However, these snapshots can only support <em class="pt">‘Snapshot’</em> accuracy; they are insufficient for <em class="pt">‘Temporal’</em> accuracy. If you have a change data capture mechanism, Chronon can use the change data stream of table mutations to maintain a near real-time refreshed view of computations. If you also capture this change data stream in your warehouse, Chronon can backfill computations at historical points in time with <em class="pt">‘Temporal’</em> accuracy.</p><p id="5aac" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">You can create an entity source by specifying a <em class="pt">‘snapshotTable’</em> and, optionally, a <em class="pt">‘mutationTable’</em> and a <em class="pt">‘mutationTopic’</em> for <em class="pt">‘Temporal’</em> accuracy. When you specify <em class="pt">‘mutationTopic’</em>, the data stream with mutations corresponding to the entity, Chronon can maintain a real-time updated view that can be read with low latency. When you specify <em class="pt">‘mutationTable’</em>, Chronon can backfill data at historical points in time with millisecond precision.</p><h1 id="6619" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Cumulative Event Sources</h1><p id="7adf" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">This data model is typically used to capture history of values for slowly changing dimensions. 
Entries of the underlying database table are only ever inserted and never updated except for a surrogate (<a class="af oe" href="https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row" rel="noopener ugc nofollow" target="_blank">SCD2</a>).</p><p id="52db" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">They are also snapshotted into the data warehouse using the same mechanism as entity sources. But because they track all changes in the snapshot, just the latest partition is sufficient for backfilling computations. And no <em class="pt">‘mutationTable’ </em>is required.</p><p id="8a9d" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">In Chronon you can specify a Cumulative Event Source by creating an event source with <em class="pt">‘table’ </em>and <em class="pt">‘topic’ </em>as before, but also by enabling a flag <em class="pt">‘isCumulative’</em>. The <em class="pt">‘table’</em> is the snapshot of the online database table that serves application traffic. The <em class="pt">‘topic’</em> is the data stream containing all the insert events.</p><h1 id="ca69" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Understanding computation contexts</h1><p id="ac08" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">Chronon can compute in two contexts, online and offline with the same compute definition.</p><p id="2d44" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">Offline computation is done over warehouse datasets (Hive tables) using batch jobs. These jobs output new datasets. 
Chronon is designed to deal with datasets that change, with newly arriving data landing in the warehouse as Hive table partitions.</p><p id="379c" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">Online, the usage is to serve application traffic at low latency (~10ms) and high QPS. Chronon maintains endpoints that serve features that are updated in real time, by generating “lambda architecture” pipelines. You can set a parameter <em class="pt">“online = True”</em> in Python to enable this.</p><p id="9698" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">Under the hood, Chronon orchestrates pipelines using Kafka, Spark/Spark Streaming, Hive, Airflow, and a customizable key-value store to power serving and training data generation.</p><h1 id="77ae" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Understanding computation types</h1><p id="52cc" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">All Chronon definitions fall into three categories: GroupBy, Join, or StagingQuery.</p><p id="ca12" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><strong class="ni gr">GroupBy — </strong>is an aggregation primitive similar to SQL, with native support for windowed and bucketed aggregations. This supports computation in both online and offline contexts and in both accuracy models: Temporal (real-time refreshed) and Snapshot (daily refreshed). GroupBy has a notion of keys by which the aggregations are performed.</p><p id="7c6a" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><strong class="ni gr">Join — </strong>Joins together data from various GroupBy computations. 
In online mode, a join query containing keys is fanned out into one query per GroupBy and external service, and the results are joined together and returned as a map. In offline mode, a join can be thought of as a list of queries at historical points in time, for which the results need to be computed in a point-in-time correct fashion. If the left side is Entities, we always compute responses as of midnight.</p><p id="3d9e" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><strong class="ni gr">StagingQuery</strong> — allows for arbitrary computation expressed as a Spark SQL query that is computed offline daily. Chronon produces partitioned datasets. It is best suited for data pre- or post-processing.</p><h1 id="f4fc" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Understanding Aggregations</h1><p id="e252" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">GroupBys in Chronon essentially aggregate data by given keys. There are several extensions to the traditional SQL group-by that make Chronon aggregations powerful.</p><ol class=""><li id="9080" class="ng nh gq ni b nj nk nl nm nn no np nq pi ns nt nu pj nw nx ny pk oa ob oc od qc pm pn bj"><strong class="ni gr">Windows</strong> — Optionally, you can choose to aggregate only recent data within a <em class="pt">window</em> of time. This is critical for ML since un-windowed aggregations tend to grow and shift in their distributions, degrading model performance. It is also critical to place greater emphasis on recent events over very old events.</li><li id="d5dc" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od qc pm pn bj"><strong class="ni gr">Bucketing</strong> — Optionally, you can also specify a second level of aggregation on a <em class="pt">bucket</em>, in addition to the Group-By keys. 
The output of a bucketed aggregation is a column of map type, with the bucket values as keys and the aggregates as values.</li><li id="9bc7" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od qc pm pn bj"><strong class="ni gr">Auto-unpack</strong> — If the input column contains data nested within an array, Chronon will automatically unpack it.</li><li id="09eb" class="ng nh gq ni b nj po nl nm nn pp np nq pi pq nt nu pj pr nx ny pk ps ob oc od qc pm pn bj"><strong class="ni gr">Time-based aggregations </strong>— such as first_k, last_k, first, and last, available when a timestamp is specified in the data source.</li></ol><p id="44f3" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">You can combine all of these options flexibly to define very powerful aggregations. Chronon internally maintains partial aggregates and combines them to produce features at different points in time, so using very large windows and backfilling training data for large date ranges is not a problem.</p><h1 id="354a" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Putting Everything together</h1><p id="cdd8" class="pw-post-body-paragraph ng nh gq ni b nj pd nl nm nn pe np nq nr pf nt nu nv pg nx ny nz ph ob oc od gj bj">As a user, you need to declare your computation only once, and Chronon will generate all the infrastructure needed to continuously turn raw data into features for both training and serving. ML practitioners at Airbnb no longer spend months trying to manually implement complex pipelines and feature indexes. They typically spend less than a week to generate new sets of features for their models.</p><p id="ab98" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj">Our core goal has been to make feature engineering as productive and as scalable as possible. 
Since the release of Chronon, users have developed over ten thousand features powering ML models at Airbnb.</p><p id="ce2a" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><strong class="ni gr">Sponsors</strong>: <a class="af oe" href="mailto:david.nagle@airbnb.com" rel="noopener ugc nofollow" target="_blank">Dave Nagle</a> <a class="af oe" href="mailto:adam.kocoloski@airbnb.com" rel="noopener ugc nofollow" target="_blank">Adam Kocoloski</a> <a class="af oe" href="mailto:paul.ellwood@airbnb.com" rel="noopener ugc nofollow" target="_blank">Paul Ellwood</a> <a class="af oe" href="mailto:joy.zhang@airbnb.com" rel="noopener ugc nofollow" target="_blank">Joy Zhang</a> <a class="af oe" href="mailto:sanjeev.katariya@airbnb.com" rel="noopener ugc nofollow" target="_blank">Sanjeev Katariya</a> <a class="af oe" href="mailto:mukund.narasimhan@airbnb.com" rel="noopener ugc nofollow" target="_blank">Mukund Narasimhan</a> <a class="af oe" href="mailto:jack.song@airbnb.com" rel="noopener ugc nofollow" target="_blank">Jack Song</a> <a class="af oe" href="mailto:weiping.peng@airbnb.com" rel="noopener ugc nofollow" target="_blank">Weiping Peng</a> <a class="af oe" href="mailto:haichun.chen@airbnb.com" rel="noopener ugc nofollow" target="_blank">Haichun Chen</a> <a class="af oe" href="mailto:atul.kale@airbnb.com" rel="noopener ugc nofollow" target="_blank">Atul Kale</a></p><p id="aedf" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><strong class="ni gr">Contributors</strong>: <a class="af oe" href="mailto:varant.zanoyan@airbnb.com" rel="noopener ugc nofollow" target="_blank">Varant Zanoyan</a> <a class="af oe" href="mailto:pengyu.hou@airbnb.com" rel="noopener ugc nofollow" target="_blank">Pengyu Hou</a> <a class="af oe" href="mailto:cristian.figueroa@airbnb.com" rel="noopener ugc nofollow" target="_blank">Cristian Figueroa</a> <a class="af oe" 
href="mailto:haozhen.ding@airbnb.com" rel="noopener ugc nofollow" target="_blank">Haozhen Ding</a> <a class="af oe" href="mailto:sophie.wang@airbnb.com" rel="noopener ugc nofollow" target="_blank">Sophie Wang</a> <a class="af oe" href="mailto:vamsee.y@airbnb.com" rel="noopener ugc nofollow" target="_blank">Vamsee Yarlagadda</a> <a class="af oe" href="mailto:evgeny.shapiro@airbnb.com" rel="noopener ugc nofollow" target="_blank">Evgenii Shapiro</a> <a class="af oe" href="https://www.linkedin.com/in/patyoon/" rel="noopener ugc nofollow" target="_blank">Patrick Yoon</a></p><p id="c214" class="pw-post-body-paragraph ng nh gq ni b nj nk nl nm nn no np nq nr ns nt nu nv nw nx ny nz oa ob oc od gj bj"><strong class="ni gr">Partners</strong>: <a class="af oe" href="mailto:navjot.sidhu@airbnb.com" rel="noopener ugc nofollow" target="_blank">Navjot Sidhu</a> <a class="af oe" href="mailto:xin.liu@airbnb.com" rel="noopener ugc nofollow" target="_blank">Xin Liu</a> <a class="af oe" href="mailto:soren.telfer@airbnb.com" rel="noopener ugc nofollow" target="_blank">Soren Telfer</a> <a class="af oe" href="mailto:tom.benner@airbnb.com" rel="noopener ugc nofollow" target="_blank">Tom Benner</a> <a class="af oe" href="mailto:wael.mahmoud@airbnb.com" rel="noopener ugc nofollow" target="_blank">Wael Mahmoud</a> <a class="af oe" href="mailto:zach.fein@airbnb.com" rel="noopener ugc nofollow" target="_blank">Zach Fein</a> <a class="af oe" href="mailto:ben.mendler@airbnb.com" rel="noopener ugc nofollow" target="_blank">Ben Mendler</a> <a class="af oe" href="mailto:michael.sestito@airbnb.com" rel="noopener ugc nofollow" target="_blank">Michael Sestito</a> <a class="af oe" href="mailto:yinhe.cheng@airbnb.com" rel="noopener ugc nofollow" target="_blank">Yinhe Cheng</a> <a class="af oe" href="mailto:tianxiang.chen@airbnb.com" rel="noopener ugc nofollow" target="_blank">Tianxiang Chen</a> <a class="af oe" href="mailto:jie.tang@airbnb.com" rel="noopener ugc nofollow" target="_blank">Jie Tang</a> 
<a class="af oe" href="mailto:austin.chan@airbnb.com" rel="noopener ugc nofollow" target="_blank">Austin Chan</a> <a class="af oe" href="mailto:moose.abdool@airbnb.com" rel="noopener ugc nofollow" target="_blank">Moose Abdool</a> <a class="af oe" href="mailto:kedar.bellare@airbnb.com" rel="noopener ugc nofollow" target="_blank">Kedar Bellare</a> <a class="af oe" href="mailto:mia.zhao@airbnb.com" rel="noopener ugc nofollow" target="_blank">Mia Zhao</a> <a class="af oe" href="mailto:yang.qi@airbnb.com" rel="noopener ugc nofollow" target="_blank">Yang Qi</a> <a class="af oe" href="mailto:kosta.ristovski@airbnb.com" rel="noopener ugc nofollow" target="_blank">Kosta Ristovski</a> <a class="af oe" href="mailto:lior.malka@airbnb.com" rel="noopener ugc nofollow" target="_blank">Lior Malka</a> <a class="af oe" href="mailto:david.staub@airbnb.com" rel="noopener ugc nofollow" target="_blank">David Staub</a> <a class="af oe" href="mailto:chandramouli.rangarajan@airbnb.com" rel="noopener ugc nofollow" target="_blank">Chandramouli Rangarajan</a> <a class="af oe" href="mailto:guang.yang@airbnb.com" rel="noopener ugc nofollow" target="_blank">Guang Yang</a> <a class="af oe" href="mailto:jian.chen@airbnb.com" rel="noopener ugc nofollow" target="_blank">Jian Chen</a></p></div></div></div></div></div></div></div></div></section></div></div></article><article class="dv"><div class="dv qv l"><div class="bg dv"><div class="dv l"><div class="dv un uo up uq ur us ut uu uv uw ux uy uz"><div class="va"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Metis: Building Airbnb’s Next Generation Data Management Platform" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19"><div class="vc vd ve vf vg"><img alt="Metis: Building Airbnb’s Next Generation Data Management Platform" class="bg vh vi vj vk vl" src="https://miro.medium.com/v2/resize:fit:1358/1*HC0FvtGMOcsE3d138wRDRw.jpeg" 
/></div></a></div><div class="vb ab ca cn"><div class="vm vn vo vp vq ab"><div class="qg l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@zhengxb2005"><div class="l ff"><img alt="Xiaobin Zheng" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/0*XRNumcimpnyRzc4W" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@zhengxb2005"><p class="be b du z jh ji jj jk jl jm jn jo bj">Xiaobin Zheng</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19"><div title=""><h2 class="be gr oi ok we wf ol om oo wg wh op nr wi wj wk wl nv wm wn wo wp nz wq wr ws wt jh jj jk jm jo bj">Metis: Building Airbnb’s Next Generation Data Management Platform</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">How Airbnb evolved our data catalog into a platform for managing and governing our data warehouse at scale.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19"><div class="ab q">8 min read·Jun 8</div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe 
xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fd2c5219edf19&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmetis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19&amp;user=Xiaobin+Zheng&amp;userId=7fb3360a60d3"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al lr an ao ap aq ar as at lq ab q ld ls" aria-label="responses" rel="noopener follow" href="https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">2</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv qv l"><div class="bg dv"><div class="dv l"><div class="dv un uo up uq ur us ut uu uv uw ux uy uz"><div class="va"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Improving Performance with HTTP Streaming" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408"><div class="vc vd ve vf vg"><img alt="Improving Performance with HTTP Streaming" class="bg vh vi vj vk vl" src="https://miro.medium.com/v2/resize:fit:1358/1*q2A2ZjnULygCKIWuiSBKXg.jpeg" /></div></a></div><div class="vb ab ca cn"><div class="vm vn vo vp vq ab"><div class="qg l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/@hey.victor"><div class="l ff"><img alt="Victor" class="l fa bx vr vs cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*02446_XHOIpleO6C7SvQ4A.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vt l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@hey.victor"><p class="be b du z jh ji jj jk jl jm jn jo bj">Victor</p></a></div></div></div><div class="vt l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="vu vv vw vx vy vz wa wb wc wd l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408"><div title=""><h2 class="be gr oi ok we wf ol om oo wg wh op nr wi wj wk wl nv wm wn wo wp nz wq wr ws wt jh jj jk jm jo bj">Improving Performance with HTTP Streaming</h2></div><div class="wu l"><h3 class="be b iq z jh wv jj jk ww jm jo dt">How HTTP Streaming can improve page performance and how Airbnb enabled it on an existing codebase</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408"><div class="ab q">7 min read·May 17</div></a><div class="wx wy wz xa xb l"><div class="ab co"><div class="am xc xd xe xf xg xh xi xj xk xl ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fba9e72c66408&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fimproving-performance-with-http-streaming-ba9e72c66408&amp;user=Victor&amp;userId=e46fded15590"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="xm l"><div><div class="bl" aria-hidden="false"><a class="af fg ah lb aj ak al lr an ao ap aq ar as at lq ab q ld ls" aria-label="responses" rel="noopener follow" href="https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408?responsesOpen=true&amp;sortBy=REVERSE_CHRON"><p class="be b du z dt">16</p></a></div></div></div></div><div class="ab q xo xp"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04</link>
      <guid>https://medium.com/airbnb-engineering/chronon-a-declarative-feature-engineering-framework-b7b8ce796e04</guid>
      <pubDate>Tue, 11 Jul 2023 20:48:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Metis: Building Airbnb’s Next Generation Data Management Platform]]></title>
      <description><![CDATA[<div class=""><div class="hr hs ht hu hv"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hw hx hy hz ia ab"><div><div class="ab ib"><a rel="noopener follow" href="https://medium.com/@zhengxb2005?source=post_page-----d2c5219edf19--------------------------------"><div><div class="bl" aria-hidden="false"><div class="l ic id bx ie if"><div class="l ff"><img alt="Xiaobin Zheng" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/0*XRNumcimpnyRzc4W" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering?source=post_page-----d2c5219edf19--------------------------------" rel="noopener follow"><div class="ij ab ff"><div><div class="bl" aria-hidden="false"><div class="l ik il bx ie im"><div class="l ff"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" rel="noopener follow" href="https://medium.com/@zhengxb2005?source=post_page-----d2c5219edf19--------------------------------">Xiaobin Zheng</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2F7fb3360a60d3&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmetis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19&amp;user=Xiaobin+Zheng&amp;userId=7fb3360a60d3&amp;source=post_page-7fb3360a60d3----d2c5219edf19---------------------post_header-----------">Follow</a></p></div></div></div></div><div class="l 
iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=post_page-----d2c5219edf19--------------------------------" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">8 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fc fd q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fd2c5219edf19&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmetis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19&amp;user=Xiaobin+Zheng&amp;userId=7fb3360a60d3&amp;source=-----d2c5219edf19---------------------clap_footer-----------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa sw cm"><div class="l ae"><div class="ab ca"><div class="sx sy sz ta tb oc ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" 
aria-hidden="false"></div></div></div></div></div></div></div></div><p id="6a28" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">How Airbnb evolved our data catalog into a platform for managing and governing our data warehouse at scale.</p><figure class="ns nt nu nv nw nx np nq paragraph-image"><div role="button" tabindex="0" class="ny nz ff oa bg ob"><div class="np nq nr"><picture></picture></div></div></figure><p id="5562" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">By:</strong> <a class="af oe" href="https://www.linkedin.com/in/erikrit/" rel="noopener ugc nofollow" target="_blank">Erik Ritter</a>, <a class="af oe" href="https://www.linkedin.com/in/jiaxin-ye-b249b259/" rel="noopener ugc nofollow" target="_blank">Jiaxin Ye</a>, <a class="af oe" href="https://www.linkedin.com/in/sylviatomiyama/" rel="noopener ugc nofollow" target="_blank">Sylvia Tomiyama</a>, <a class="af oe" href="https://www.linkedin.com/in/woodyzhou/" rel="noopener ugc nofollow" target="_blank">Woody Zhou</a>, <a class="af oe" href="http://www.linkedin.com/in/xiaobin-zheng" rel="noopener ugc nofollow" target="_blank">Xiaobin Zheng</a>, <a class="af oe" href="https://www.linkedin.com/in/zuzanavejrazkova/" rel="noopener ugc nofollow" target="_blank">Zuzana Vejrazkova</a></p><h1 id="1c3f" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Introduction</h1><p id="fde9" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no 
gj bj">At Airbnb, millions of data assets exist in a complex ecosystem to inform our business and improve our products. The Data Management team’s mission is to <strong class="mt gr">empower the company to manage its data ecosystem at scale</strong>.</p><p id="78bc" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">To do this, we need an accurate understanding of all of the assets in our ecosystem and how they relate to each other. In other words, it requires accurate metadata. Our data management platform Metis, named for the Greek goddess of good counsel, is our solution to ensure that trustworthy metadata can be <strong class="mt gr">captured</strong>, <strong class="mt gr">managed</strong>, and <strong class="mt gr">consumed</strong> at scale.</p><h1 id="2ea2" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">From humble beginnings</h1><p id="1c29" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Metis is an evolution of our existing foundation of metadata products within Airbnb.</p><p id="a12a" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/democratizing-data-at-airbnb-852d76c51770">Dataportal</a> was our first effort towards democratizing data: successfully enabling data users to find trusted data. It was a huge boon to productivity and pretty ahead of its time.</p><p id="35dd" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">As data reliability and compliance regulations became important, we needed a more comprehensive and detailed understanding of how data was transformed. 
This led to our adoption of <a class="af oe" href="https://atlas.apache.org/#/" rel="noopener ugc nofollow" target="_blank">Apache Atlas</a> as our data lineage solution. Apache Atlas powers products like SLA Tracker (see <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/visualizing-data-timeliness-at-airbnb-ee638fdf4710">Visualizing Data Timeliness at Airbnb</a>), which combines landing time metadata and lineage to enable debugging upstream data delays.</p><p id="db1e" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">As our metadata requirements grew to cover more areas, such as cost management and data quality, our needs for a data catalog expanded to include:</p><ul class=""><li id="e729" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Ability to govern both the data and the metadata describing it</li><li id="b69b" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Guardrails and recommendations to improve data quality</li><li id="9e7c" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Auditability of a dataset’s history, for both debugging and governance purposes</li></ul><p id="a6d2" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We soon learned that data management had to be pursued as a discipline, which led us to build Metis as the one-stop shop for accessing all metadata about our data.</p><h1 id="6db5" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">What we’ve built</h1><p id="bb40" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Metis is made up of three core products: Dataportal, Unified Metadata Service (UMS), and Lineage Service. Together, this platform allows Airbnb 
to manage <strong class="mt gr">millions of data assets</strong> across many domains. A short list of the assets we support includes:</p><ul class=""><li id="6d9c" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj"><a class="af oe" href="https://hive.apache.org/" rel="noopener ugc nofollow" target="_blank">Apache Hive</a> and Trino datasets</li><li id="6f72" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Metrics and Dimensions, powered by <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70">Airbnb’s Metric Platform: Minerva</a></li><li id="ab46" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Charts and Dashboards from <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/supercharging-apache-superset-b1a2393278bd">Apache Superset</a> and Tableau®</li><li id="5ee4" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Data Models, including those <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/data-quality-at-airbnb-e582465f3ef7">certified by Midas</a></li><li id="9133" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Machine Learning features and models</li><li id="7e61" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Teams and employees of Airbnb (not technically a data asset, but key to supporting high-quality ownership and ensuring metadata remains up to date for all of the above data assets)</li></ul><h1 id="586c" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Metis Architecture</h1><figure class="ns nt nu nv nw nx np nq paragraph-image"><div role="button" tabindex="0" class="ny nz ff oa bg ob"><div class="np nq 
pu"><picture></picture></div></div><figcaption class="pv pw px np nq py pz be b bf z dt"><em class="qa">Metis Architecture</em></figcaption></figure><p id="b535" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">At a high level, Metis consists of the following components:</p><p id="0e1f" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Dataportal </strong>— serves as a catalog and management UI for human users.</p><p id="5b15" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Viaduct — </strong><a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344">Airbnb’s in-house GraphQL API layer</a> modeling the offline data ecosystem.</p><p id="f0c4" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">UMS Core service </strong>— a backend service holding the system schema and business logic needed for metadata management.</p><p id="e4d5" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Metadata storage</strong></p><p id="19df" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">MySQL</strong> — primarily storing critical metadata that needs to be centrally managed.</p><p id="0602" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Lineage Graph </strong>— a centralized service collecting and serving data lineage.</p><p id="2d55" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng 
nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Elasticsearch</strong> — serving search &amp; discovery use cases.</p><p id="cf86" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Offline Component</strong> — external to the UMS Core service, performing offline tasks such as metadata consistency checks and policy enforcement.</p><p id="a4fb" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Offline Dataset </strong>— offline export of metadata for analytics use cases.</p><h1 id="d28d" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Dataportal</h1><p id="da41" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Dataportal serves as the UI for Airbnb’s data catalog and is a place for people to find and manage all the assets supported by Metis. It’s built as a Single Page Application using React and TypeScript and is therefore flexible enough to serve the large variety of workflows required for data management and governance. 
The frontend communicates with UMS and other services via a GraphQL API; this is especially important as we want to prevent both sequential fetches of lineage information and over-fetching large amounts of metadata to ensure a performant user experience.</p><h1 id="7189" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Search and Discovery</h1><p id="2752" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The Dataportal experience starts with search, so that both data consumers and data owners can find the assets they need. We’ve designed our search and discovery experience with a few principles in mind:</p><ul class=""><li id="f222" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Display relevant metadata directly in the search results to help people find the exact asset they’re looking for</li><li id="8568" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Uprank high-quality and commonly used data assets, in case the user doesn’t know the exact asset they need</li></ul><p id="9f4e" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">As a result, search results tend to surface high-quality, certified datasets, each shown with its description, recent user count, and last-modified time to help the user decide which asset to select:</p><figure class="ns nt nu nv nw nx np nq paragraph-image"><div role="button" tabindex="0" class="ny nz ff oa bg ob"><div class="np nq qb"><picture></picture></div></div><figcaption class="pv pw px np nq py pz be b bf z dt"><em class="qa">Dataportal search result page</em></figcaption></figure><h1 id="e720" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Management Capabilities</h1><p id="3eb7" 
class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Once the desired asset is located, the user can visit the <strong class="mt gr">Entity Page</strong> to perform a large variety of consumption, management, and governance actions. We structure all the content on the entity page into tabs grouped by category of data or action:</p><figure class="ns nt nu nv nw nx np nq paragraph-image"><div role="button" tabindex="0" class="ny nz ff oa bg ob"><div class="np nq qc"><picture></picture></div></div><figcaption class="pv pw px np nq py pz be b bf z dt">The many tabs available for metadata on Hive Tables</figcaption></figure><p id="223e" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Consumption and documentation related tabs make it easy for people to learn how to use this table, with column and table descriptions in the Configuration tab, owner and consumer data on the Points of contact tab, and further details on how to use the table on the Documentation tab. Beyond that, these pages also allow users to take on management activities, as seen in the below screenshots:</p><figure class="ns nt nu nv nw nx np nq paragraph-image"><div role="button" tabindex="0" class="ny nz ff oa bg ob"><div class="np nq qd"><picture></picture></div></div><figcaption class="pv pw px np nq py pz be b bf z dt"><em class="qa">Anyone can tag columns that contain personal data. 
Changing and removing tags require a review to ensure that personal data is correctly identified in our warehouse.</em></figcaption></figure><p id="bec9" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">The above screenshot highlights only a subset of the ways we leveled up the Dataportal from a searchable data catalog into the one centralized place to manage and govern all your data assets.</p><h1 id="bb90" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Unified Metadata Service</h1><p id="cc8c" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Unified Metadata Service, or UMS, is the backend core of our centralized data management platform. It provides:</p><ul class=""><li id="a6f1" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">A centralized schema, with a GraphQL API layer on top of it for accessing metadata</li><li id="0173" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">A centralized relationship graph to connect siloed metadata</li><li id="307e" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Centralized metadata management capabilities to enable systems to meet compliance and governance requirements without reinventing the wheel</li></ul><p id="302b" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Centralizing metadata in UMS means that metadata providers and consumers no longer need to integrate with each other directly; instead, each provider and consumer needs to integrate only with UMS:</p><figure class="ns nt nu nv nw nx np nq paragraph-image"><div role="button" tabindex="0" class="ny nz ff oa bg ob"><div class="np nq qe"><picture></picture></div></div><figcaption class="pv pw px np nq py 
pz be b bf z dt">Reducing integration points for metadata</figcaption></figure><h1 id="1017" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Metadata Integration Patterns</h1><p id="4f90" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">UMS plays various roles across metadata integrations and use cases. In a decentralized data ecosystem, we are very opinionated about what metadata should be stored in, replicated to, or served through UMS.</p><h1 id="bd82" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Unified presentation layer proxying requests</h1><p id="97b1" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">UMS proxies read requests to many data systems, including:</p><ul class=""><li id="2af4" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Hive Metastore for table schema and table properties.</li><li id="eac2" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Lineage service for raw Hive table data lineage.</li><li id="06de" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Data Governance service for the governance status of datasets.</li></ul><h1 id="112a" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Metadata management service</h1><p id="775c" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">UMS centrally manages a small set of critical business metadata, storing it in its own metadata database and providing the following management capabilities:</p><ul class=""><li id="9f11" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Validation and 
authorization for updates</li><li id="950b" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Audit history</li><li id="bdf1" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Approval workflow for sensitive operations on critical metadata</li></ul><h1 id="e37c" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Supporting online use cases for offline-generated metadata</h1><p id="89a5" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">As part of Airbnb’s <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/data-quality-at-airbnb-e582465f3ef7">Data Quality Initiative</a>, we implemented data quality scores that are directly tied to each data asset in the data warehouse. Data quality scores for datasets are generated offline and ingested into the UMS metadata database for online consumption.</p><h1 id="9839" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Centrally managed search indexes powering data discovery</h1><p id="d425" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Similar to a traditional data catalog, UMS centrally manages indexes in an Elasticsearch cluster for different entities to power data discovery.</p><h1 id="9811" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Metadata Ingestion</h1><p id="7dc1" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">There are cases where metadata needs to be stored in or replicated into the Metis storage layer. UMS integrates with metadata providers through a variety of paved mechanisms that ingest metadata using Airbnb’s tech stack. 
These include:</p><ul class=""><li id="028b" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Stream processing (Flink) jobs ingesting metadata change events.</li><li id="f05e" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">ETL (Airflow) jobs that run daily to pull from metadata providers and push to UMS.</li><li id="01ce" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Direct calls to the UMS API.</li></ul><p id="3c56" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">When we onboard a new metadata provider, the key work involved is identifying product requirements and aligning on the scope of metadata integration, followed by finalizing the actual integration mechanism.</p><h1 id="a976" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Lineage Service</h1><p id="b742" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The final major piece of Metis is our Lineage Service. We adopted Apache Atlas as Airbnb’s data lineage solution for our Data Warehouse back in 2020.</p><p id="1c53" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">At Airbnb, Apache Atlas holds a large lineage graph containing over 100 million nodes and 300 million edges. 
Most of the lineage data comes from production Hive tables and a large volume of intermediate Hive tables in our Data Warehouse.</p><p id="9658" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We have extensively customized and tuned Apache Atlas to handle the large scale of lineage events in our Data Warehouse:</p><ul class=""><li id="9010" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Applying a sharding strategy to lineage events to increase parallelism.</li><li id="6577" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Improving Atlas server code efficiency on top of a graph database.</li><li id="6780" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Fine-tuning the underlying storage systems backing the graph database for scalability and latency.</li><li id="c3ff" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Optimizing the read path and adding filtering support so that lineage data can be accessed more efficiently.</li></ul><p id="1f3c" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Atlas’s lineage-related components, including its Graph Engine (JanusGraph), Type System, Ingest (with Hook integrations), and lineage API, have allowed us to efficiently collect and serve lineage data, providing valuable insights into the relationships between various data assets and pipelines. This lineage data powers many critical data compliance, data reliability, and data quality products. 
See <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/visualizing-data-timeliness-at-airbnb-ee638fdf4710">Visualizing Data Timeliness at Airbnb</a>.</p><h1 id="d4d2" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Conclusion &amp; Appreciations</h1><p id="46ee" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">As shown above, Airbnb’s approach to data management has significantly evolved over the past 6 years. We started building <a class="af oe" rel="noopener" href="https://medium.com/airbnb-engineering/democratizing-data-at-airbnb-852d76c51770">Dataportal</a> with a goal to “democratize data” at Airbnb, and we now have Metis: a platform that enables anyone at Airbnb to search, discover, consume, and manage all the data and metadata in our offline warehouse. Metis has been serving critical roles across data compliance, data reliability, data quality initiatives and is helping 1000+ data users every week.</p><p id="5dea" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Our future work will involve two key priorities: firstly, we will focus on evolving our system architecture and underlying technology in order to keep pace with the rapid evolution of our data ecosystem. Secondly, we plan to expand our coverage to more systems and enable more advanced data management capabilities, reflecting our ongoing commitment to investing in data here at Airbnb.</p><p id="88d5" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Metis would not have been possible without the members of the data management team as well as our cross functional and cross org collaborators. 
They include, but are not limited to: Adam Kocoloski, Adam Wong, Cindy Yu, Dave Nagle, Erik Ritter, Jerry Wang, Jiaxin Ye, John Bodley, Jyoti Wadhwani, Liyin Tang, Michelle Thomas, Nathan Towery, Paul Ellwood, Sylvia Tomiyama, Vyl Chiang, Woody Zhou, Xiaobin Zheng, and Zuzana Vejrazkova.</p><p id="a5a3" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Apache Airflow, Apache Atlas, Apache Hive, Apache Superset, Atlas, and Hive are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p><p id="06dc" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">All trademarks, service marks, company names and product names are the property of their respective owners. Any use of these is for identification purposes only and does not imply sponsorship or endorsement.</p></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19</link>
      <guid>https://medium.com/airbnb-engineering/metis-building-airbnbs-next-generation-data-management-platform-d2c5219edf19</guid>
      <pubDate>Thu, 08 Jun 2023 19:09:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Improving Performance with HTTP Streaming]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gj gk gl gm gn"><div class="ab ca"><div class="ch bg fv fw fx fy"><div class=""><div class="hr hs ht hu hv"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hw hx hy hz ia ab"><div><div class="ab ib"><a rel="noopener follow" href="https://medium.com/@hey.victor?source=post_page-----ba9e72c66408--------------------------------"><div><div class="bl" aria-hidden="false"><div class="l ic id bx ie if"><div class="l ff"><img alt="Victor" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/1*02446_XHOIpleO6C7SvQ4A.jpeg" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering?source=post_page-----ba9e72c66408--------------------------------" rel="noopener follow"><div class="ij ab ff"><div><div class="bl" aria-hidden="false"><div class="l ik il bx ie im"><div class="l ff"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" rel="noopener follow" href="https://medium.com/@hey.victor?source=post_page-----ba9e72c66408--------------------------------">Victor</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2Fe46fded15590&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fimproving-performance-with-http-streaming-ba9e72c66408&amp;user=Victor&amp;userId=e46fded15590&amp;source=post_page-e46fded15590----ba9e72c66408---------------------post_header-----------">Follow</a></p></div></div></div></div><div class="l iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=post_page-----ba9e72c66408--------------------------------" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">7 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fc fd q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Fba9e72c66408&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fimproving-performance-with-http-streaming-ba9e72c66408&amp;user=Victor&amp;userId=e46fded15590&amp;source=-----ba9e72c66408---------------------clap_footer-----------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa ti 
cm"><div class="l ae"><div class="ab ca"><div class="tj tk tl tm tn od ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><p id="7a08" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">How HTTP Streaming can improve page performance and how Airbnb enabled it on an existing codebase</p><p id="9dd1" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">By:</strong> <a class="af np" href="https://www.linkedin.com/in/victorhlin/" rel="noopener ugc nofollow" target="_blank">Victor Lin</a></p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr ns"><picture></picture></div></div></figure><h1 id="715f" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Introduction</h1><p id="a054" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">You may have heard a joke that the <a class="af np" href="https://en.wikipedia.org/wiki/Series_of_tubes" rel="noopener ugc nofollow" target="_blank">Internet is a series of tubes</a>. In this blog post, we’re going to talk about how we get a cool, refreshing stream of Airbnb.com bytes into your browser as quickly as possible using HTTP Streaming.</p><p id="f9bf" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Let’s first understand what streaming means. 
Imagine we had a spigot and two options:</p><ul class=""><li id="7ad4" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Fill a big cup, and then pour it all down the tube (the “buffered” strategy)</li><li id="4438" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Connect the spigot directly to the tube (the “streaming” strategy)</li></ul><p id="f095" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">In the buffered strategy, everything happens sequentially: our servers first generate the entire response into a buffer (filling the cup), and then more time is spent sending it over the network (pouring it down). The streaming strategy happens in parallel. We break the response into chunks, which are sent as soon as they are ready. The server can start working on the next chunk while previous chunks are still being sent, and the client (e.g., a browser) can begin handling the response before it has been fully received.</p><h1 id="d6eb" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Implementing Streaming at Airbnb</h1><p id="ca5b" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Streaming has clear advantages, but most websites today still rely on a buffered approach to generate responses. One reason for this is the additional engineering effort required to break the page into independent chunks. This just isn’t feasible sometimes. For example, if all of the content on the page relies on a slow backend query, then we won’t be able to send anything until that query finishes.</p><p id="806e" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">However, there’s one use case that’s universally applicable. 
We can use streaming to reduce <strong class="mt gr">network waterfalls</strong>. This term refers to when one network request triggers another, resulting in a cascading series of sequential requests. This is easily visualized in a tool like Chrome’s <a class="af np" href="https://developer.chrome.com/docs/devtools/network/reference/#waterfall" rel="noopener ugc nofollow" target="_blank">Waterfall</a>:</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr pt"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt">Chrome Network Waterfall illustrating a cascade of sequential requests</figcaption></figure><p id="bde8" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Most web pages rely on external JavaScript and CSS files linked within the HTML, resulting in a network waterfall — downloading the HTML triggers JavaScript and CSS downloads. As a result, it’s a best practice to place all CSS and JavaScript tags near the beginning of the HTML in the <code class="cw pz qa qb qc b">&lt;head&gt;</code> tag. This ensures that the browser sees them earlier. With streaming, we can reduce this delay further, by sending that portion of the <code class="cw pz qa qb qc b">&lt;head&gt;</code> tag first.</p><h1 id="ea4f" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Early Flush</h1><p id="98d3" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The most straightforward way to send an early <code class="cw pz qa qb qc b">&lt;head&gt;</code> tag is by breaking a standard response into two parts. 
This technique is called <strong class="mt gr">Early Flush</strong>, as one part is sent (“flushed”) before the other.</p><p id="3146" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">The first part contains things that are fast to compute and can be sent quickly. At Airbnb, we include tags for fonts, CSS, and JavaScript, so that we get the browser benefits mentioned above. The second part contains the rest of the page, including content that relies on API or database queries to compute. The end result looks like this:</p><p id="7a7e" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Early chunk:</p><pre class="nt nu nv nw nx qd qc qe bo qf qg qh">&lt;html&gt;
  &lt;head&gt;
    &lt;script src=… defer /&gt;
    &lt;link rel="stylesheet" href=… /&gt;
    &lt;!-- lots of other &lt;meta&gt; and other tags… --&gt;</pre><p id="7ff4" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Late chunk:</p><pre class="nt nu nv nw nx qd qc qe bo qf qg qh">    &lt;!-- &lt;head&gt; tags that depend on data go here --&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;!-- Body content here --&gt;
  &lt;/body&gt;
&lt;/html&gt;</pre><p id="2ee9" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We had to restructure our app to make this possible. For context, Airbnb uses an Express-based NodeJS server to render web pages using React. We previously had a single React component in charge of rendering the complete HTML document. However, this presented two problems:</p><ul class=""><li id="9d43" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Producing incremental chunks of content means we need to work with partial/unclosed HTML tags. For example, the examples you saw above are invalid HTML. 
The <code class="cw pz qa qb qc b">&lt;html&gt;</code> and <code class="cw pz qa qb qc b">&lt;head&gt;</code> tags are opened in the Early chunk, but closed in the Late chunk. There’s no way to generate this sort of output using the standard React rendering functions.</li><li id="5fca" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">We can’t render this component until we have all of the data for it.</li></ul><p id="904a" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We solved these problems by breaking our monolithic component into three:</p><ul class=""><li id="3763" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">an “Early &lt;head&gt;” component</li><li id="bb0e" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">a “Late &lt;head&gt;” component, for &lt;head&gt; tags that depend on data</li><li id="9f1c" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">a “&lt;body&gt;” component</li></ul><p id="03a1" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Each component renders the <em class="qn">contents</em> of the head or body tag. Then we stitch them together by writing open/close tags directly to the HTTP response stream. 
Overall, the process looks like this:</p><ol class=""><li id="5565" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no qo pm pn bj">Write <code class="cw pz qa qb qc b">&lt;html&gt;&lt;head&gt;</code></li><li id="975a" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qo pm pn bj">Render and write the Early &lt;head&gt; to the response</li><li id="c5b9" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qo pm pn bj">Wait for data</li><li id="4d15" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qo pm pn bj">Render and write the Late &lt;head&gt; to the response</li><li id="c855" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qo pm pn bj">Write <code class="cw pz qa qb qc b">&lt;/head&gt;&lt;body&gt;</code></li><li id="b3e9" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qo pm pn bj">Render and write the &lt;body&gt; to the response</li><li id="2388" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qo pm pn bj">Finish up by writing <code class="cw pz qa qb qc b">&lt;/body&gt;&lt;/html&gt;</code></li></ol><h1 id="364f" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Data Streaming</h1><p id="f357" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Early Flush optimizes CSS and JavaScript network waterfalls. However, users will still be staring at a blank page until the <code class="cw pz qa qb qc b">&lt;body&gt;</code> tag arrives. We’d like to improve this by rendering a loading state when there’s no data, which gets replaced once the data arrives. 
Conveniently, we already have loading states in this situation for client side routing, so we could accomplish this by just rendering the app without waiting for data!</p><p id="4e58" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Unfortunately, this causes another network waterfall. Browsers have to receive the SSR (Server-Side Render), and then JavaScript triggers another network request to fetch the actual data:</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qp"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt">Graph showing a network waterfall where SSR and client-side data fetch happen sequentially</figcaption></figure><p id="6218" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">In our testing, this resulted in a slower <em class="qn">total</em> loading time.</p><p id="44fa" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">What if we could include this data in the HTML? This would allow our server-side rendering and data fetching to happen in parallel:</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qq"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt">Graph showing SSR and client-side data fetch happening in parallel</figcaption></figure><p id="b8bc" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Given that we had already broken the page into two chunks with Early Flush, it’s relatively straightforward to introduce a third chunk for what we call <strong class="mt gr">Deferred Data</strong>. 
This chunk goes after all of the visible content and does not block rendering. We execute the network requests on the server and stream the responses into the Deferred Data chunk. In the end, our three chunks look like this:</p><p id="3cb8" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Early chunk</p><pre class="nt nu nv nw nx qd qc qe bo qf qg qh">&lt;html&gt;
  &lt;head&gt;
    &lt;link rel="preload" as="script" href=… /&gt;
    &lt;link rel="stylesheet" href=… /&gt;
    &lt;!-- lots of other &lt;meta&gt; and other tags… --&gt;</pre><p id="a45c" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Body chunk</p><pre class="nt nu nv nw nx qd qc qe bo qf qg qh">    &lt;!-- &lt;head&gt; tags that depend on data go here --&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;!-- Body content here --&gt;
    &lt;script src=… /&gt;</pre><p id="6042" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Deferred Data chunk</p><pre class="nt nu nv nw nx qd qc qe bo qf qg qh">    &lt;script type="application/json"&gt;
      &lt;!-- data --&gt;
    &lt;/script&gt;
  &lt;/body&gt;
&lt;/html&gt;</pre><p id="beef" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">With this implemented on the server, the only remaining task is to write some JavaScript to detect when our Deferred Data chunk arrives. We did this with a <a class="af np" href="https://developer.mozilla.org/en-US/docs/Web/API/MutationObserver" rel="noopener ugc nofollow" target="_blank">MutationObserver</a>, which is an efficient way to observe DOM changes. Once the Deferred Data JSON element is detected, we parse the result and inject it into our application’s network data store. 
From the application’s perspective, it’s as though a normal network request has been completed.</p><p id="64b4" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Watch out for <code class="cw pz qa qb qc b">defer</code></strong></p><p id="3228" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">You may notice that some tags are re-ordered from the Early Flush example. The script tags moved from the Early chunk to the Body chunk and no longer have the <a class="af np" href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script#attributes" rel="noopener ugc nofollow" target="_blank">defer attribute</a>. This attribute avoids render-blocking script execution by deferring scripts until after the HTML has been downloaded and parsed. This is suboptimal when using Deferred Data: deferred scripts would wait for the whole document, including the Deferred Data chunk, even though all of the visible content has already been received by the end of the Body chunk and render-blocking is no longer a concern at that point. We can fix this by moving the script tags to the end of the Body chunk, and removing the defer attribute. Moving the tags later in the document does introduce a network waterfall, which we solved by adding <a class="af np" href="https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/rel/preload" rel="noopener ugc nofollow" target="_blank">preload</a> tags into the Early chunk.</p><h1 id="1672" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Implementation Challenges</h1><h1 id="d693" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Status codes and headers</h1><p id="254a" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Early Flush prevents subsequent changes to the headers (e.g., to redirect or change the status code). 
In the React + NodeJS world, it’s common to delegate redirects and error throwing to a React app rendered after the data has been fetched. This won’t work if you’ve already sent an early <code class="cw pz qa qb qc b">&lt;head&gt;</code> tag and a 200 OK status.</p><p id="6ddd" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We solved this problem by moving error and redirect logic out of our React app. That logic is now performed in <a class="af np" href="https://expressjs.com/en/guide/using-middleware.html" rel="noopener ugc nofollow" target="_blank">Express server middleware</a> before we attempt to Early Flush.</p><h1 id="802d" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Buffering</h1><p id="f77b" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">We found that <a class="af np" href="https://www.nginx.com/resources/wiki/start/topics/examples/x-accel/#x-accel-buffering" rel="noopener ugc nofollow" target="_blank">nginx</a> buffers responses by default. This has resource utilization benefits but is counterproductive when the goal is sending incremental responses. We had to configure these services to disable buffering. We expected a potential increase in resource usage with this change but found the impact to be negligible.</p><h1 id="534f" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Response delays</h1><p id="4a34" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">We noticed that our Early Flush responses had an unexpected delay of around 200ms, which disappeared when we disabled gzip compression. 
This turned out to be an interaction between <a class="af np" href="https://en.wikipedia.org/wiki/Nagle%27s_algorithm" rel="noopener ugc nofollow" target="_blank">Nagle’s algorithm</a> and <a class="af np" href="https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment" rel="noopener ugc nofollow" target="_blank">Delayed ACK</a>. These optimizations attempt to maximize data sent per packet, introducing latency when sending small amounts of data. It’s especially easy to run into this issue with <a class="af np" href="https://en.wikipedia.org/wiki/Jumbo_frame" rel="noopener ugc nofollow" target="_blank">jumbo frames</a>, which increase maximum packet sizes. It turns out that gzip reduced the size of our writes to the point where they couldn’t fill a packet, and the solution was to disable Nagle’s algorithm in our <a class="af np" href="https://www.haproxy.com/documentation/hapee/latest/onepage/#4.2-option%20http-no-delay" rel="noopener ugc nofollow" target="_blank">haproxy</a> load balancer.</p><h1 id="72d0" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Conclusion</h1><p id="afbd" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">HTTP Streaming has been a very successful strategy for improving web performance at Airbnb. Our experiments showed that Early Flush produced a flat reduction in <a class="af np" href="https://web.dev/fcp/" rel="noopener ugc nofollow" target="_blank">First Contentful Paint</a> (FCP) of around 100ms on every page tested, including the Airbnb homepage. Data streaming further eliminated the FCP costs of slow backend queries. While there were challenges along the way, we found that adapting our existing React application to support streaming was very feasible and robust, despite not being designed for it originally. 
We’re also excited to see the broader frontend ecosystem trend in the direction of prioritizing streaming, from <a class="af np" href="https://graphql.org/blog/2020-12-08-improving-latency-with-defer-and-stream-directives/" rel="noopener ugc nofollow" target="_blank">@defer and @stream in GraphQL</a> to <a class="af np" href="https://nextjs.org/docs/advanced-features/react-18/streaming" rel="noopener ugc nofollow" target="_blank">streaming SSR in Next.js</a>. Whether you’re using these new technologies, or extending an existing codebase, we hope you’ll explore streaming to build a faster frontend for all!</p><p id="ecb2" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">If this type of work interests you, check out some of our related positions <a class="af np" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p><h1 id="55ba" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Acknowledgments</h1><p id="2e78" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Elliott Sprehn, Aditya Punjani, Jason Jian, Changgeng Li, Siyuan Zhou, Bruce Paul, Max Sadrieh, and everyone else who helped design and implement streaming at Airbnb!</p><h1 id="1e8d" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">****************</h1><p id="1416" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj"><em class="qn">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. 
Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div></div></div></section></div></div></article><article class="dv"><div class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vg vh vi vj vk vl vm vn vo vp vq vr vs"><div class="vt"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-michael-kinoti-645d4c228d06?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="vv vw vx vy vz"><img alt="" class="bg wa wb wc wd" src="https://miro.medium.com/v2/resize:fit:1358/1*X0-h_g8Qrt3TWzbOuBzzMw.jpeg" role="presentation" /></div></a></div><div class="vu ab ca cn"><div class="we wf wg wh wi ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@lauren.mackevich?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="l ff"><img alt="Lauren Mackevich" class="l fa bx wj wk cw" src="https://miro.medium.com/v2/resize:fill:40:40/0*-imhApAGWwgM89i1.jpg" width="20" height="20" /></div></a></div></div></div><div class="wl l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@lauren.mackevich?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Lauren Mackevich</p></a></div></div></div><div class="wl l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" 
href="https://medium.com/airbnb-engineering?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="wm wn wo wp wq wr ws wt wu wv l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-michael-kinoti-645d4c228d06?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div title=""><h2 class="be gr oi ok ww wx ol om oo wy wz op nc xa xb xc xd ng xe xf xg xh nk xi xj xk xl jh jj jk jm jo bj">My Journey to Airbnb — Michael Kinoti</h2></div><div class="xm l"><h3 class="be b iq z jh xn jj jk xo jm jo dt">Saying no to med school and following a dream all the way to Silicon Valley</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-michael-kinoti-645d4c228d06?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="ab q">7 min read·Apr 26</div></a><div class="xp xq xr xs xt l"><div class="ab co"><div class="am xu xv xw xx xy xz ya yb yc yd ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F645d4c228d06&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmy-journey-to-airbnb-michael-kinoti-645d4c228d06&amp;user=Lauren+Mackevich&amp;userId=ae9de0d76057&amp;source=-----645d4c228d06----0-----------------clap_footer----6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="ye l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-michael-kinoti-645d4c228d06?source=author_recirc-----ba9e72c66408----0---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yg yh"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vg vh vi vj vk vl vm vn vo vp vq vr vs"><div class="vt"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/airbnb-engineering/journey-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="vv vw vx vy vz"><img alt="" class="bg wa wb wc wd" src="https://miro.medium.com/v2/resize:fit:1358/1*rLBEt8kz__tykm6lCLxDtQ.jpeg" role="presentation" /></div></a></div><div class="vu ab ca cn"><div class="we wf wg wh wi ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" 
rel="noopener follow" href="https://medium.com/@arjun.raman?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="l ff"><img alt="Arjun Raman" class="l fa bx wj wk cw" src="https://miro.medium.com/v2/resize:fill:40:40/0*2WGqAeKadV1z3xZ2" width="20" height="20" /></div></a></div></div></div><div class="wl l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@arjun.raman?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Arjun Raman</p></a></div></div></div><div class="wl l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="wm wn wo wp wq wr ws wt wu wv l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/airbnb-engineering/journey-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div title="Journey Platform: A low-code tool for creating interactive user workflows"><h2 class="be gr oi ok ww wx ol om oo wy wz op nc xa xb xc xd ng xe xf xg xh nk xi xj xk xl jh jj jk jm jo bj">Journey Platform: A low-code tool for creating interactive user workflows</h2></div><div class="xm l"><h3 class="be b iq z jh xn jj jk xo jm jo dt">Journey Platform: Low-code notification workflow platform that allows technical and 
non-technical users to create complex workflows through…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/airbnb-engineering/journey-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="ab q">9 min read·5 days ago</div></a><div class="xp xq xr xs xt l"><div class="ab co"><div class="am xu xv xw xx xy xz ya yb yc yd ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F9954f51fa3f8&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fjourney-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8&amp;user=Arjun+Raman&amp;userId=7b20e16d6d70&amp;source=-----9954f51fa3f8----1-----------------clap_footer----6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="ye l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/journey-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8?source=author_recirc-----ba9e72c66408----1---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yg yh"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div 
class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vg vh vi vj vk vl vm vn vo vp vq vr vs"><div class="vt"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/airbnb-engineering/flexible-continuous-integration-for-ios-4ab33ea4072f?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="vv vw vx vy vz"><img alt="A person leans over the edge of a balcony. In the background are trees." class="bg wa wb wc wd" src="https://miro.medium.com/v2/resize:fit:1358/1*mGebUVa4KQWzQvo_YDSffQ.jpeg" /></div></a></div><div class="vu ab ca cn"><div class="we wf wg wh wi ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@michaelbachand?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="l ff"><img alt="Michael Bachand" class="l fa bx wj wk cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*hRzU_BfPKFdM77OfC1eYQw.jpeg" width="20" height="20" /></div></a></div></div></div><div class="wl l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@michaelbachand?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Michael Bachand</p></a></div></div></div><div class="wl l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech 
Blog</p></a></div></div></div></div><div class="wm wn wo wp wq wr ws wt wu wv l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/airbnb-engineering/flexible-continuous-integration-for-ios-4ab33ea4072f?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div title=""><h2 class="be gr oi ok ww wx ol om oo wy wz op nc xa xb xc xd ng xe xf xg xh nk xi xj xk xl jh jj jk jm jo bj">Flexible Continuous Integration for iOS</h2></div><div class="xm l"><h3 class="be b iq z jh xn jj jk xo jm jo dt">How Airbnb leverages AWS, Packer, and Terraform to update macOS on hundreds of CI machines in hours instead of days</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/airbnb-engineering/flexible-continuous-integration-for-ios-4ab33ea4072f?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="ab q">10 min read·6 days ago</div></a><div class="xp xq xr xs xt l"><div class="ab co"><div class="am xu xv xw xx xy xz ya yb yc yd ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F4ab33ea4072f&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fflexible-continuous-integration-for-ios-4ab33ea4072f&amp;user=Michael+Bachand&amp;userId=90f72207e307&amp;source=-----4ab33ea4072f----2-----------------clap_footer----6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div 
class="ye l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/flexible-continuous-integration-for-ios-4ab33ea4072f?source=author_recirc-----ba9e72c66408----2---------------------6d1cddfa_ad99_44ab_9c25_b6156422c5d4-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yg yh"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408</link>
      <guid>https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408</guid>
      <pubDate>Wed, 17 May 2023 18:48:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Journey Platform: A low-code tool for creating interactive user workflows]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gj gk gl gm gn"><div class="ab ca"><div class="cb cc cd ce cf cg ch bg"><div class=""><div class="hr hs ht hu hv"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hw hx hy hz ia ab"><div><div class="ab ib"><a rel="noopener follow" href="https://medium.com/@arjun.raman?source=post_page-----9954f51fa3f8--------------------------------"><div><div class="bl" aria-hidden="false"><div class="l ic id bx ie if"><div class="l ff"><img alt="Arjun Raman" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/0*2WGqAeKadV1z3xZ2" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering?source=post_page-----9954f51fa3f8--------------------------------" rel="noopener follow"><div class="ij ab ff"><div><div class="bl" aria-hidden="false"><div class="l ik il bx ie im"><div class="l ff"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" rel="noopener follow" href="https://medium.com/@arjun.raman?source=post_page-----9954f51fa3f8--------------------------------">Arjun Raman</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2F7b20e16d6d70&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fjourney-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8&amp;user=Arjun+Raman&amp;userId=7b20e16d6d70&amp;source=post_page-7b20e16d6d70----9954f51fa3f8---------------------post_header-----------">Follow</a></p></div></div></div></div><div class="l iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=post_page-----9954f51fa3f8--------------------------------" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">9 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fc fd q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F9954f51fa3f8&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fjourney-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8&amp;user=Arjun+Raman&amp;userId=7b20e16d6d70&amp;source=-----9954f51fa3f8---------------------clap_footer-----------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h 
k"><div><div class="bl" aria-hidden="false"></div></div><div class="fa tt cm"><div class="l ae"><div class="ab ca"><div class="tu tv tw tx ty od ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><p id="fa23" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Journey Platform: Low-code notification workflow platform that allows technical and non-technical users to create complex workflows through a simple drag and drop user interface.</p><p id="7a35" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">By: </strong><a class="af np" href="https://www.linkedin.com/in/arraman/" rel="noopener ugc nofollow" target="_blank">Arjun Raman</a>, <a class="af np" href="https://www.linkedin.com/in/dsalcoda/" rel="noopener ugc nofollow" target="_blank">Ken Snyder</a>, <a class="af np" href="https://www.linkedin.com/in/mengtingli1010/" rel="noopener ugc nofollow" target="_blank">Mengting Li</a></p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr ns"><picture></picture></div></div></figure><h1 id="f8d4" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Introduction</h1><p id="8048" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Effective communication hinges on delivering the right message, to the right audience, at the right time. 
At Airbnb, our goal is to engage our users, both guests and hosts, by delivering inspirational and informational notifications through various channels, such as email or in-app messages.</p><p id="99a0" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><a class="af np" rel="noopener" href="https://medium.com/airbnb-engineering/airbnbs-promotions-and-communications-platform-6266f1ffe2bd">Historically</a> at Airbnb, complex notification workflows have been solely managed by engineering teams, with each workflow requiring the deployment of code. As our platform evolved, we recognized the need for a low-code or no-code solution to streamline the creation of these intricate notification workflows. In response, the Marketing Technology team developed the Journey Platform, a powerful tool that enables non-technical users to build and deliver personalized notifications based on our users’ engagement with Airbnb.</p><p id="6817" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">The goals of the Journey Platform are:</p><ol class=""><li id="bb2a" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj">Empower users to easily create event-driven notification workflows using an intuitive drag and drop interface.</li><li id="9fd4" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Enable real-time execution of these notification workflows for timely and relevant communication.</li><li id="6fdf" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Offer a unified interface for managing both transactional notifications, such as upcoming trip reminders, and promotional notifications.</li><li id="2aa5" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Guarantee Service Level Agreements 
(SLAs) for processing various types of notification workflows, including transactional and promotional communications.</li><li id="60c6" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj">Reduce the time required to develop complex notification workflows.</li></ol><p id="6044" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Journey Platform lets users iterate faster through self-serve workflow creation. It has reduced the time taken to support a new use-case from 1–2 months to just 1–2 weeks.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr pt"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 1: Time saved in Journey Platform</em></figcaption></figure><h1 id="d104" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Overview</h1><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qa"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 2: Journey Platform architecture overview</em></figcaption></figure><p id="9272" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">The key components of the Journey Platform are:</p><ol class=""><li id="a4ec" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no pl pm pn bj"><strong class="mt gr">Journey Platform UI:</strong> A <a class="af np" href="https://en.wikipedia.org/wiki/WYSIWYG" rel="noopener ugc nofollow" target="_blank">WYSIWYG</a> tool that lets users drag and drop components to create a workflow. 
The workflow definition is then converted to a custom <a class="af np" href="https://en.wikipedia.org/wiki/Domain-specific_language" rel="noopener ugc nofollow" target="_blank">DSL (Domain-specific language)</a> that can be interpreted and executed by the workflow orchestrator.</li><li id="05f8" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj"><strong class="mt gr">Workflow Orchestrator:</strong> The brain of the system, the workflow orchestrator takes in the workflow definition DSL from the UI. Once a workflow is launched, it listens for events from the event store that can start the execution of a workflow, parses and then interprets the DSL to execute workflows on the workflow engine, and relies on the action store to perform specific tasks.</li><li id="83b2" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no pl pm pn bj"><strong class="mt gr">Platform Store:</strong></li></ol><ul class=""><li id="5a6e" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no qb pm pn bj">Event store: Pre-configured catalog of Kafka events that Journey Platform can listen to in order to trigger new executions of a workflow or pass events to an existing workflow execution.</li><li id="0d88" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qb pm pn bj">Action store: Repository of predefined, specific-purpose functions that allow users to perform various tasks, such as sending emails, push notifications, or emitting Kafka events. Custom actions can be defined and integrated into the tool, making them accessible to all Journey Platform users.</li><li id="e898" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qb pm pn bj">Attribute store: Central repository for essential data, such as user metadata (e.g. user’s geolocation, Airbnb search history, etc.) and contextual information. 
It supports decision-making in workflow branching by exposing this data as parameters, managed through the parameter manager, on which conditions can be set.</li><li id="30d5" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qb pm pn bj">Custom stores: Ability to create custom action or attribute stores which aren’t already defined in the platform.</li></ul><h1 id="e6f7" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">User Interface</h1><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qc"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 3: Manipulating and connecting nodes in a Journey Platform workflow</em></figcaption></figure><p id="a692" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">When crafting the UI for the workflow automation system, we aimed to create a familiar and intuitive experience. 
Drawing inspiration from flow charts, productivity tools with “inspector panels,” and incorporating drag and drop functionality, we wanted a platform where users could start immediately without consulting the manual.</p><p id="d11f" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We also had a goal of using <a class="af np" href="https://www.nngroup.com/articles/progressive-disclosure/" rel="noopener ugc nofollow" target="_blank">progressive disclosure</a> to incrementally enable the full depth of the platform capabilities, while keeping it simple for users who only need a small subset of the features. By using sensible defaults, and moving more complex features into tabs and sub-screens, our advanced users could create unique solutions, going beyond the pre-planned use cases.</p><p id="01f8" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">To edit the graph, we leveraged <a class="af np" href="https://reactflow.dev/" rel="noopener ugc nofollow" target="_blank">React Flow</a>, an open-source library. This enabled us to display the graph, as well as provide basic operations like zooming, panning, moving, and connecting nodes. 
On top of this foundation, we added our custom node and edge components, along with drag and drop functionality for adding new nodes and an inspector panel for editing existing ones.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qd"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 4: The “node inspector” panel can show a variety of form inputs depending on the type of node.</em></figcaption></figure><p id="d3c7" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">To create the forms displayed in the inspector panel, we implemented a schema-based form system. This system provides a high level of flexibility, allowing us to declaratively specify the UI for specific node input/output fields as part of their type definitions. The system is built in a type-safe manner, making use of Thrift annotations and Java reflection. Based on the schema information and UI-specific annotations, the interface displays the appropriate form fields, help text, and validation, ensuring our UI is automatically up-to-date with the platform’s capabilities.</p><h1 id="020b" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Backend Design</h1><h2 id="a52e" class="qe og gq be oh qf qg dx ol qh qi dz op nc qj qk ql ng qm qn qo nk qp qq qr qs bj">Domain-specific language</h2><p id="ce72" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The DSL provides a high degree of flexibility and customization, allowing us to define the structure and behavior of the workflow. Rather than hardcoding each workflow in the workflow engine, we define a single generic workflow that can execute any DSL-based workflow. 
Nodes and edges make up a workflow, with nodes representing individual actions or tasks and edges defining the dependencies and relationships between them.</p><p id="63b8" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">The nodes and edges include all the necessary information to define a workflow such as inputs, outputs, and parameters passed between nodes. The DSL generated by the UI is passed to the workflow orchestrator, where the DSL parser executes it.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qc"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 5: Workflow with the translated DSL</em></figcaption></figure><figure class="nt nu nv nw nx ny"><div class="qt jh l ff"><figcaption class="pu pv pw nq nr px py be b bf z dt">DSL representation of a workflow</figcaption></div></figure><h2 id="4b49" class="qe og gq be oh qf qg dx ol qh qi dz op nc qj qk ql ng qm qn qo nk qp qq qr qs bj">Journey Stores</h2><p id="5afc" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The events, attributes, and actions stores are an integral part of the backend, as they allow listening to events to start workflow executions, filter users, and execute tasks in the journey. 
All these components work together seamlessly to create a flexible and customizable backend that can be tailored to the specific needs of the platform.</p><h2 id="5f7e" class="qe og gq be oh qf qg dx ol qh qi dz op nc qj qk ql ng qm qn qo nk qp qq qr qs bj">Event Store</h2><p id="2419" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Journey Platform supports listening to different Kafka events and using them to trigger new executions of a workflow or to pass signals to a running execution. For example, it can start a new execution when a guest books a stay, or pass a signal to a running execution when a user receives a push notification. As with the action store, once an event is onboarded, all teams at Airbnb can use it.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qc"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 6: Start node with event trigger</em></figcaption></figure><h2 id="091d" class="qe og gq be oh qf qg dx ol qh qi dz op nc qj qk ql ng qm qn qo nk qp qq qr qs bj">Attribute Store</h2><p id="7247" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The attribute store functions as a central repository for fetching all necessary data, such as contextual data, user preferences, and device information, which can be used to enrich the workflow branching process and improve decision-making capabilities. 
These stores are supported by a data storage system that manages various attributes or characteristics of entities.</p><p id="27c3" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Imagine you have a new user who just signed up for Airbnb, and you’re interested in determining whether they’ve conducted any listing searches on the platform. If the answer is yes, you’ll send a personalized message based on their search history, and if it’s no, you’ll send a static message.</p><p id="7775" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">This is a concrete example of how the Airbnb Journey Platform leverages attributes, such as “listing search history,” to enhance the user experience. These attributes are extracted and defined as parameters, which can be used for various purposes. Each workflow execution has its own parameter data collection, which can be accessed in the parameter manager. Parameters are discussed in more detail in the Parameter Manager section.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qw"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 7: Setting filter condition using the attribute store</em></figcaption></figure><h2 id="1f9c" class="qe og gq be oh qf qg dx ol qh qi dz op nc qj qk ql ng qm qn qo nk qp qq qr qs bj">Action Store</h2><p id="6027" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The action store is used to execute various tasks, such as sending an email or updating a database record, when a user reaches a specific point in the journey. 
It is a common library where each function can be shared and reused by different users in their workflow.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div class="nq nr qx"><picture></picture></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 8: Example Actions supported in Journey platform.</em></figcaption></figure><p id="d5e5" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Each action implements a common interface, including its metadata required for the UI schema-based forms mentioned above, and its behavior during the actual workflow execution.</p><figure class="nt nu nv nw nx ny"><div class="qt jh l ff"><figcaption class="pu pv pw nq nr px py be b bf z dt">Interface all the Actions must implement</figcaption></div></figure><h2 id="77e6" class="qe og gq be oh qf qg dx ol qh qi dz op nc qj qk ql ng qm qn qo nk qp qq qr qs bj">Parameter Manager</h2><p id="4cb9" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Managing a complex workflow that involves multiple steps with varying inputs and outputs can be a challenging task, especially if the input and output parameters change frequently or are different for each user. For instance, you might need conditional branching in your workflow or personalized communication content based on user search. This is where parameterized workflows and parameter managers can prove to be invaluable components.</p><p id="d54f" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">By specifying inputs and outputs (of attribute node / event node / custom node) as parameters, you can reuse them throughout the entire workflow execution. 
A parameter manager is a critical component that can store and manage your workflow parameters, streamlining the process of creating, storing, retrieving, and modifying them.</p><p id="66ce" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">In addition to providing an efficient parameter management system, a parameter manager also provides a range of features such as parameter creation, storage, retrieval, modification, versioning, access control, and auditing. These features ensure that your workflow is executed reliably and consistently while also properly managing and storing your parameters throughout the entire workflow.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qy"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 9: Adding a param from the parameter library</em></figcaption></figure><h1 id="7b0b" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Workflow Orchestrator</h1><p id="02ff" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The Workflow Orchestrator executes workflows by interpreting the meaning of each DSL node and performing the corresponding actions. It manages low-level functions such as storing state, interacting with the action store to perform an action, and listening for callbacks through the event store, allowing developers to concentrate on workflow logic rather than technical details. Journey Platform utilizes <a class="af np" href="https://temporal.io/" rel="noopener ugc nofollow" target="_blank">Temporal</a> as the underlying workflow engine for state maintenance and orchestration. 
Temporal helps orchestrate workflows through <a class="af np" href="https://docs.temporal.io/workers" rel="noopener ugc nofollow" target="_blank">Temporal Workers.</a></p><figure class="nt nu nv nw nx ny"><div class="qt jh l ff"><figcaption class="pu pv pw nq nr px py be b bf z dt">DSL interpreter</figcaption></div></figure><p id="dc79" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Developers can incorporate custom functionality such as new nodes or edges to broaden platform capabilities, making it simpler to create workflows that fulfill the platform’s and users’ unique requirements. Additionally, the orchestrator supports advanced features like parallel execution and automatic retries, enhancing platform reliability and performance.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qz"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 10: Workflow Orchestrator</em></figcaption></figure><h1 id="74e6" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Scaling the system</h1><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr ra"><picture></picture></div></div><figcaption class="pu pv pw nq nr px py be b bf z dt"><em class="pz">Figure 11: Multi-tenant system with dedicated processing lanes</em></figcaption></figure><p id="7c92" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Meeting SLAs for processing different types of workflows (i.e. transactional and promotional) is critical at scale. Transactional notifications initiated by user action (e.g. booking confirmation, guest/Host messaging, etc.) have a strict SLA and require higher priority when compared to promotional notifications. 
To achieve this, we have implemented the following at different parts of the system:</p><p id="9f94" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Event pre-processing:</strong></p><ul class=""><li id="ca16" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no qb pm pn bj"><strong class="mt gr">Pre-filter:</strong> Instead of passing all the events directly to the Workflow Handler, the event processor filters out events that don’t match the criteria. For example, it passes only events of type reservation_complete and drops all other reservation events. This greatly reduces the QPS feeding into the system.</li><li id="f361" class="mr ms gq mt b mu po mw mx my pp na nb pi pq ne nf pj pr ni nj pk ps nm nn no qb pm pn bj"><strong class="mt gr">Aggregate high QPS events:</strong> Events like searches have a high QPS. Instead of processing them directly, we batch and aggregate them over a time window. This reduces the QPS by at least a few orders of magnitude.</li></ul><p id="3743" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">Dedicated lanes:</strong></p><ul class=""><li id="88ec" class="mr ms gq mt b mu mv mw mx my mz na nb pi nd ne nf pj nh ni nj pk nl nm nn no qb pm pn bj">We have dedicated lanes for different categories of workflows through the system. The event listener has different consumer groups with built-in throttling. 
The workflow handler has dedicated <a class="af np" href="https://docs.temporal.io/namespaces" rel="noopener ugc nofollow" target="_blank">Temporal namespaces</a> for each category and strict limits on the processing QPS, max QPS to the database, etc.</li></ul><h1 id="e6e4" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Conclusion</h1><p id="0b71" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">The Journey Platform empowers non-technical and technical users to create complex stateful workflows through a simple drag and drop interface. By leveraging a generic workflow definition DSL, along with action store, event store, and attribute store, the platform facilitates the creation of workflows that respond to real-time events, streamlining communication, and enhancing user experiences.</p><p id="5f23" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><em class="rb">Interested in working at Airbnb? 
Check out </em><a class="af np" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="rb">these open roles</em></a><em class="rb">.</em></p><h1 id="1e56" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">Acknowledgments</h1><p id="b696" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj">Thanks to Balaji Kalaimani, Davis Wamola, Iris Feng, Jesse Garrison, John Bernardo, Kumar Arjunan, Michael Endelman, Priyank Singhal, Steve Krulewitz, Tej Sudha, Victoria Gryn, Xin Tu, and Zhentao Sun for their contributions in building Journey Platform.</p><p id="d2af" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Thanks to Sagar Naik and Michael Kinoti for their leadership and supporting us in this <em class="rb">Journey</em>.</p><h1 id="c0a8" class="of og gq be oh oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc bj">****************</h1><p id="c3ce" class="pw-post-body-paragraph mr ms gq mt b mu pd mw mx my pe na nb nc pf ne nf ng pg ni nj nk ph nm nn no gj bj"><em class="rb">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. 
Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div></div></div></section></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/airbnb-engineering/a-deep-dive-into-airbnbs-server-driven-ui-system-842244c5f5?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/0*CedYKpSYMIGEiX7m" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/@rbro112?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="l ff"><img alt="Ryan Brooks" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*py_8uAIKHqAuW89G5PgOeQ.png" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@rbro112?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Ryan Brooks</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/airbnb-engineering/a-deep-dive-into-airbnbs-server-driven-ui-system-842244c5f5?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">A Deep Dive into Airbnb’s Server-Driven UI System</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">How Airbnb ships features faster across web, iOS, and Android using a server-driven UI system named Ghost Platform ?.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership 
and reading time information" rel="noopener follow" href="https://medium.com/airbnb-engineering/a-deep-dive-into-airbnbs-server-driven-ui-system-842244c5f5?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="ab q">11 min read·Jun 29, 2021</div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F842244c5f5&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fa-deep-dive-into-airbnbs-server-driven-ui-system-842244c5f5&amp;user=Ryan+Brooks&amp;userId=4c31895f4c38&amp;source=-----842244c5f5----1-----------------clap_footer----fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/a-deep-dive-into-airbnbs-server-driven-ui-system-842244c5f5?source=author_recirc-----9954f51fa3f8----1---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" 
href="https://medium.com/airbnb-engineering/avoiding-double-payments-in-a-distributed-payments-system-2981f6b070bb?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/1*vDoYk7bf-GgFBhcgDzRrGA.jpeg" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@jon.j.chew?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="l ff"><img alt="Jon Chew" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*PDQg8XCdHlaVlivuxpDl4Q.jpeg" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@jon.j.chew?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Jon Chew</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" 
href="https://medium.com/airbnb-engineering/avoiding-double-payments-in-a-distributed-payments-system-2981f6b070bb?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">Avoiding Double Payments in a Distributed Payments System</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">How we built a generic idempotency framework to achieve eventual consistency and correctness across our payments micro-service…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/airbnb-engineering/avoiding-double-payments-in-a-distributed-payments-system-2981f6b070bb?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·14 min read·Apr 16, 2019</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F2981f6b070bb&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Favoiding-double-payments-in-a-distributed-payments-system-2981f6b070bb&amp;user=Jon+Chew&amp;userId=cc54ee66d513&amp;source=-----2981f6b070bb----2-----------------clap_footer----fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an 
ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/avoiding-double-payments-in-a-distributed-payments-system-2981f6b070bb?source=author_recirc-----9954f51fa3f8----2---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/1*rB53PQsJi73IeA-eIeucIg.png" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@rchang?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="l ff"><img alt="Robert Chang" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*EguVA0HsIGqUy0gaDS1VgA.png" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" 
href="https://medium.com/@rchang?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Robert Chang</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">How Airbnb Achieved Metric Consistency at Scale</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">Part-I: Introducing Minerva — Airbnb’s Metric Platform</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="ab q">12 min read·Apr 30, 2021</div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" 
rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2Ff23cc53dea70&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fhow-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70&amp;user=Robert+Chang&amp;userId=c00b242128fe&amp;source=-----f23cc53dea70----3-----------------clap_footer----fa35e0f9_959c_4af7_96c0_f205613fb96f-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70?source=author_recirc-----9954f51fa3f8----3---------------------fa35e0f9_959c_4af7_96c0_f205613fb96f-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/@TobiasCharles/how-to-be-a-manager-everyone-wants-to-work-for-40490d1acba5?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/0*li_xOtXHajLjbrKf" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/@TobiasCharles?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="l ff"><img alt="Tobias Charles" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*hM3f9lj4ejnc-fqf_Upz7g.png" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@TobiasCharles?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Tobias Charles</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/@TobiasCharles/how-to-be-a-manager-everyone-wants-to-work-for-40490d1acba5?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">How to Be a Manager Everyone Wants to Work for</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">Without relying on just being nice</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/@TobiasCharles/how-to-be-a-manager-everyone-wants-to-work-for-40490d1acba5?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·5 min read·May 1</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q 
kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2F40490d1acba5&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2F%40TobiasCharles%2Fhow-to-be-a-manager-everyone-wants-to-work-for-40490d1acba5&amp;user=Tobias+Charles&amp;userId=371e714a81a8&amp;source=-----40490d1acba5----0-----------------clap_footer----644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@TobiasCharles/how-to-be-a-manager-everyone-wants-to-work-for-40490d1acba5?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/@scotthyoung/10-best-ways-to-use-chatgpt-with-examples-f2e5ba86de38?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/0*64-xGsXkGlejrQtM" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" 
aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@scotthyoung?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="l ff"><img alt="Scott H. Young" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/2*88Qdf_PKsdTYMipqHcYWtA.jpeg" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@scotthyoung?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Scott H. Young</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/@scotthyoung/10-best-ways-to-use-chatgpt-with-examples-f2e5ba86de38?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">10 Best Ways To Use ChatGPT (With Examples)</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">Tips on how maximize your work and school performance with the latest AI tool.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/@scotthyoung/10-best-ways-to-use-chatgpt-with-examples-f2e5ba86de38?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·10 min read·3 days 
ago</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2Ff2e5ba86de38&amp;operation=register&amp;redirect=https%3A%2F%2Fscotthyoung.medium.com%2F10-best-ways-to-use-chatgpt-with-examples-f2e5ba86de38&amp;user=Scott+H.+Young&amp;userId=912c3d6e3387&amp;source=-----f2e5ba86de38----1-----------------clap_footer----644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@scotthyoung/10-best-ways-to-use-chatgpt-with-examples-f2e5ba86de38?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/gitconnected/how-to-create-a-video-summarizer-powered-by-ai-in-20-minutes-cbad2bf51254?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/1*RTmJRrQzJRw1Ol3TMMETeg.png" 
role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@wenbohuang0307?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="l ff"><img alt="Yeyu Huang" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*56MoILoyaRDA-_NLG2N8pg.jpeg" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@wenbohuang0307?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Yeyu Huang</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/gitconnected?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Level Up Coding</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/gitconnected/how-to-create-a-video-summarizer-powered-by-ai-in-20-minutes-cbad2bf51254?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">How To Create A Video Summarizer Powered By AI, In 20 Minutes</h2></div><div class="xr l"><h3 
class="be b iq z jh xs jj jk xt jm jo dt">A quick guide for building a website that generates summaries for any Youtube video</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/gitconnected/how-to-create-a-video-summarizer-powered-by-ai-in-20-minutes-cbad2bf51254?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·10 min read·3 days ago</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fgitconnected%2Fcbad2bf51254&amp;operation=register&amp;redirect=https%3A%2F%2Flevelup.gitconnected.com%2Fhow-to-create-a-video-summarizer-powered-by-ai-in-20-minutes-cbad2bf51254&amp;user=Yeyu+Huang&amp;userId=36da9d8de7e&amp;source=-----cbad2bf51254----0-----------------clap_footer----644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/gitconnected/how-to-create-a-video-summarizer-powered-by-ai-in-20-minutes-cbad2bf51254?source=read_next_recirc-----9954f51fa3f8----0---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i 
d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/towards-data-science/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="wg wh wi wj wk"><img alt="Two stochastic parrots sitting on a chain of large language models: LangChain" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/1*4C54ZxHRM1dOlAvlvoEJZg@2x.jpeg" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@iamleonie?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="l ff"><img alt="Leonie Monigatti" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*TTIl4oynrJyfIkLbC6fumA.png" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@iamleonie?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Leonie Monigatti</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" 
href="https://medium.com/towards-data-science?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Towards Data Science</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/towards-data-science/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div title="Getting Started with LangChain: A Beginner’s Guide to Building LLM-Powered Applications"><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">Getting Started with LangChain: A Beginner’s Guide to Building LLM-Powered Applications</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">A LangChain tutorial to build anything with large language models in Python</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/towards-data-science/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·12 min read·Apr 25</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Ftowards-data-science%2F95fc8898732c&amp;operation=register&amp;redirect=https%3A%2F%2Ftowardsdatascience.com%2Fgetting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c&amp;user=Leonie+Monigatti&amp;userId=3a38da70d8dc&amp;source=-----95fc8898732c----1-----------------clap_footer----644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/towards-data-science/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c?source=read_next_recirc-----9954f51fa3f8----1---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/1*y0vJwEfN45barnQO9jiYew.jpeg" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf 
l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@frank-andrade?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="l ff"><img alt="The PyCoach" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*veEX4-CiLz5jqUjwWfQo_Q.jpeg" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@frank-andrade?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">The PyCoach</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/artificial-corner?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Artificial Corner</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div title="You’re Using ChatGPT Wrong! Here’s How to Be Ahead of 99% of ChatGPT Users"><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">You’re Using ChatGPT Wrong! 
Here’s How to Be Ahead of 99% of ChatGPT Users</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">Master ChatGPT by learning prompt engineering.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·7 min read·Mar 17</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fartificial-corner%2F886a50dabc54&amp;operation=register&amp;redirect=https%3A%2F%2Fartificialcorner.com%2Fyoure-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54&amp;user=The+PyCoach&amp;userId=fb44e21903f3&amp;source=-----886a50dabc54----2-----------------clap_footer----644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----9954f51fa3f8----2---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl 
ym"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv rt l"><div class="bg dv"><div class="dv l"><div class="dv vr vs vt vu vv vw vx vy vz wa wb wc wd"><div class="we"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/better-programming/faster-data-experimentation-with-cookiecutter-14a80f1859cd?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="wg wh wi wj wk"><img alt="" class="bg wl wm wn wo" src="https://miro.medium.com/v2/resize:fit:1358/0*zAXvoC2SceDxvzBP.jpg" role="presentation" /></div></a></div><div class="wf ab ca cn"><div class="wp wq wr ws wt ab"><div class="rf l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@petrica.leuca?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="l ff"><img alt="Petrica Leuca" class="l fa bx wu wv cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*eEK1w5k-25C0PFupAN0zDg.jpeg" width="20" height="20" /></div></a></div></div></div><div class="ww l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@petrica.leuca?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Petrica Leuca</p></a></div></div></div><div class="ww l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" 
href="https://medium.com/better-programming?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Better Programming</p></a></div></div></div></div><div class="wx wy wz xa xb xc xd xe xf xg l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/better-programming/faster-data-experimentation-with-cookiecutter-14a80f1859cd?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div title=""><h2 class="be gr oi ok xh xi ol om oo xj xk op nc qk xl xm ql ng qn xn xo qo nk qq xp xq qr jh jj jk jm jo bj">Faster Data Experimentation With “cookiecutter”</h2></div><div class="xr l"><h3 class="be b iq z jh xs jj jk xt jm jo dt">Speed up development and enforce standards</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/better-programming/faster-data-experimentation-with-cookiecutter-14a80f1859cd?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="ab q"><div class="rt ab"><div class="bl" aria-hidden="false"></div>·7 min read·Apr 3</div></div></a><div class="xu xv xw xx xy l"><div class="ab co"><div class="am xz ya yb yc yd ye yf yg yh yi ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fbetter-programming%2F14a80f1859cd&amp;operation=register&amp;redirect=https%3A%2F%2Fbetterprogramming.pub%2Ffaster-data-experimentation-with-cookiecutter-14a80f1859cd&amp;user=Petrica+Leuca&amp;userId=988ae1b5eb22&amp;source=-----14a80f1859cd----3-----------------clap_footer----644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="yj l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/better-programming/faster-data-experimentation-with-cookiecutter-14a80f1859cd?source=read_next_recirc-----9954f51fa3f8----3---------------------644e6c93_2ccc_4854_aa1e_01ce2ee64d94-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q yl ym"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/journey-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8</link>
      <guid>https://medium.com/airbnb-engineering/journey-platform-a-low-code-tool-for-creating-interactive-user-workflows-9954f51fa3f8</guid>
      <pubDate>Thu, 11 May 2023 21:13:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Flexible Continuous Integration for iOS]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="gj gk gl gm gn"><div class="ab ca"><div class="cb cc cd ce cf cg ch bg"><div class=""><div class="hr hs ht hu hv"><div class="speechify-ignore ab co"><div class="speechify-ignore bg l"><div class="hw hx hy hz ia ab"><div><div class="ab ib"><a rel="noopener follow" href="https://medium.com/@michaelbachand?source=post_page-----4ab33ea4072f--------------------------------"><div><div class="bl" aria-hidden="false"><div class="l ic id bx ie if"><div class="l ff"><img alt="Michael Bachand" class="l fa bx dc dd cw" src="https://miro.medium.com/v2/resize:fill:88:88/1*hRzU_BfPKFdM77OfC1eYQw.jpeg" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering?source=post_page-----4ab33ea4072f--------------------------------" rel="noopener follow"><div class="ij ab ff"><div><div class="bl" aria-hidden="false"><div class="l ik il bx ie im"><div class="l ff"><img alt="The Airbnb Tech Blog" class="l fa bx bq in cw" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="io ab q"><div class="ab q ip"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b iq ir bj"><a class="af ag ah ai aj ak al am an ao ap aq ar is" rel="noopener follow" href="https://medium.com/@michaelbachand?source=post_page-----4ab33ea4072f--------------------------------">Michael Bachand</a></p></div></div></div>·<p class="be b iq ir dt"><a class="iv iw ah ai aj ak al am an ao ap aq ar eu ix iy" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2F90f72207e307&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fflexible-continuous-integration-for-ios-4ab33ea4072f&amp;user=Michael+Bachand&amp;userId=90f72207e307&amp;source=post_page-90f72207e307----4ab33ea4072f---------------------post_header-----------">Follow</a></p></div></div></div></div><div class="l iz"><div class="ab cm ja jb jc"><div class="jd je ab"><div class="be b bf z dt ab jf">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/airbnb-engineering?source=post_page-----4ab33ea4072f--------------------------------" rel="noopener follow"><p class="be b bf z jh ji jj jk jl jm jn jo bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">10 min read<div class="jp jq l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab co jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg"><div class="h k w fc fd q"><div class="kw l"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F4ab33ea4072f&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fflexible-continuous-integration-for-ios-4ab33ea4072f&amp;user=Michael+Bachand&amp;userId=90f72207e307&amp;source=-----4ab33ea4072f---------------------clap_footer-----------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q kh ki kj kk kl km kn ko kp kq kr ks kt ku kv"><div class="h k"><div><div class="bl" aria-hidden="false"></div></div><div 
class="fa th cm"><div class="l ae"><div class="ab ca"><div class="ti tj tk tl tm od ch bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><p id="5a09" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><em class="np">How Airbnb leverages AWS, Packer, and Terraform to update macOS on hundreds of CI machines in hours instead of days</em></p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr ns"><picture></picture></div></div></figure><p id="72e8" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><strong class="mt gr">By:</strong> <a class="af of" href="https://www.linkedin.com/in/mbachand" rel="noopener ugc nofollow" target="_blank">Michael Bachand</a>, <a class="af of" href="https://www.linkedin.com/in/xianwen1014" rel="noopener ugc nofollow" target="_blank">Xianwen Chen</a></p><p id="51b9" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">At Airbnb, we run a comprehensive suite of continuous integration (CI) jobs before each iOS code change is merged. These jobs ensure that the main branch remains stable by executing critical developer workflows like building the iOS application and running tests. 
We also schedule jobs that perform periodic tasks like reporting metrics and uploading artifacts.</p><p id="1edb" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Many of our iOS CI jobs execute on Macs, which enables running developer tools provided by Apple. CI jobs for all other platforms at Airbnb execute in containers on Amazon EC2 Linux instances. To fulfill the macOS requirement of iOS CI jobs, we have historically maintained separate CI infrastructure outside of AWS specifically for iOS development. The <a class="af of" href="https://aws.amazon.com/about-aws/whats-new/2020/11/announcing-amazon-ec2-mac-instances-for-macos/" rel="noopener ugc nofollow" target="_blank">introduction of Macs</a> to AWS provided an opportunity for us to rethink our approach to iOS CI.</p><p id="405b" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We designed the next iteration of our iOS CI system in late 2021, finished the migration to the new system in mid-2022, and polished the system through the end of 2022. CI for iOS and all other platforms at Airbnb already leveraged Buildkite for dispatching jobs. Now, we deploy iOS CI infrastructure to AWS using Terraform, which helps align CI for iOS with CI for other platforms at Airbnb.</p><p id="a4c6" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">In this article, we are excited to share details of the flexible and easy-to-maintain iOS CI system that we’ve implemented with Amazon EC2 Mac instances.</p><h1 id="45c3" class="og oh gq be oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bj">The Challenges with Running CI on Physical Macs</h1><p id="965d" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">Historically, we ran Airbnb iOS CI on physical Macs. 
We enjoyed the speed of running CI without virtualization but we paid a substantial maintenance cost to run CI jobs directly on physical hardware. An iOS infrastructure engineer individually logged into over 300 machines to perform administrative tasks like enrolling the Mac in our MDM (Mobile Device Management) tool and upgrading macOS. Manual maintenance requirements limited the scalability of the fleet and consumed engineer time that could be better spent on higher-value projects.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr pj"><picture></picture></div></div><figcaption class="pk pl pm nq nr pn po be b bf z dt">An engineer remotely updates multiple physical Macs to macOS Big Sur. EC2 macOS AMIs have eliminated this manual work.</figcaption></figure><p id="f805" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Our old CI machines were rarely restarted and too often drifted into a bad state. When this occurred, the best-case scenario was that an engineer could log into the machine, diagnose what configuration drift was causing issues, and manually bring the machine back to a good state. More commonly, we shut down the corrupted machine so that it could no longer accept new CI jobs. Periodically, we asked the vendor who managed our physical Macs to restore the corrupted machines to a clean installation of macOS. When the machines eventually came back online, we manually re-enrolled each machine in MDM to bring our fleet back to its full capacity.</p><p id="0d91" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Updating to a new version of Xcode was quite error-prone as well. 
We strive to roll out new Xcode versions regularly since many iOS engineers at Airbnb follow Swift and Xcode releases closely and are eager to adopt new language features and IDE improvements. However, the fixed capacity of our Mac fleet made it difficult for us to verify iOS CI jobs thoroughly against new versions; any machine allocated to testing a new version of Xcode could no longer accept CI jobs from the previous Xcode version. The risk of tackling each Xcode update was increased by the fact that rolling back to a previous version of Xcode across our fleet was not practical.</p><h1 id="4443" class="og oh gq be oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bj">Upgrading CI with Custom macOS AMIs</h1><p id="5022" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">When evaluating AWS, we were excited by the possibility of launching instances from Amazon Machine Images (AMIs). An <a class="af of" href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html" rel="noopener ugc nofollow" target="_blank">AMI</a> is a snapshot of an instance’s state, including its file system contents and other metadata. Amazon provides base AMIs for each macOS version and allows customers to create their own AMIs from running instances.</p><p id="e3ee" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">AMIs allow us to add new instances to our fleet without human intervention. An EC2 Mac bare-metal instance launched from a properly configured AMI is immediately ready to accept new work after initialization. When updating macOS, we no longer need to log into every machine in our fleet. Instead, we log into a single instance launched from the Amazon base AMI for the new macOS version. 
After performing a handful of manual configuration steps, like enabling <a class="af of" href="https://support.apple.com/en-us/HT201476" rel="noopener ugc nofollow" target="_blank">automatic login</a>, we create an Airbnb base AMI from that instance.</p><p id="d6e6" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Initially, we powered our EC2 Mac fleet with manually created AMIs. An engineer would configure a single instance and create an AMI from that instance’s state. Then we could launch any number of additional instances from that AMI. This was a major improvement over managing physical machines since we could spin up an entire fleet of identical instances after configuring only a single instance successfully.</p><p id="365b" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Now, we <a class="af of" href="https://aws.amazon.com/blogs/compute/building-amazon-machine-images-amis-for-ec2-mac-instances-with-packer/" rel="noopener ugc nofollow" target="_blank">build AMIs using Packer</a>. Packer programmatically launches and configures an EC2 instance using a template defined in the HashiCorp configuration language (HCL). Packer then creates an AMI from the configured EC2 instance. A Ruby wrapper script invokes Packer consistently and performs helpful validations like checking that the user has assumed the proper AWS role. We check the HCL template code into source control and all changes to our Packer template and companion scripts are made via GitHub pull requests.</p><figure class="nt nu nv nw nx ny"><div class="pp jh l ff"><figcaption class="pk pl pm nq nr pn po be b bf z dt">Timing statistics for creating a new Arm AMI with Packer. 
This command ran on an EC2 mac2.metal instance.</figcaption></div></figure><p id="c80a" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We initially ran Packer from developer laptops, but the laptop needed to be awake and online for the duration of the Packer build. Eventually, we created a dedicated pipeline to build AMIs in the cloud. A developer can trigger a new build on this pipeline with a couple of clicks. A successful build will produce freshly baked and verified AMIs for both the x86 and Arm (Apple Silicon) CPU architectures within a few hours.</p><h1 id="62d6" class="og oh gq be oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bj">Defining CI Environments in Terraform</h1><p id="2c0e" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">Our new CI system leveraging these AMIs consists of many environments, each of which can be managed independently. The central AWS component of each CI environment is an <a class="af of" href="https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-groups.html" rel="noopener ugc nofollow" target="_blank">Auto Scaling group</a>, which is responsible for launching the EC2 Mac instances. 
The number of instances in the Auto Scaling group is determined by the <a class="af of" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group#desired_capacity" rel="noopener ugc nofollow" target="_blank">desired capacity</a> property on the group and is bounded by <a class="af of" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group#min_size" rel="noopener ugc nofollow" target="_blank">min</a> and <a class="af of" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group#max_size" rel="noopener ugc nofollow" target="_blank">max</a> size properties.</p><p id="15cf" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">An Auto Scaling group creates new instances using a <a class="af of" href="https://docs.aws.amazon.com/autoscaling/ec2/userguide/launch-templates.html" rel="noopener ugc nofollow" target="_blank">launch template</a>. The launch template specifies the configuration of each instance, including the AMI, and allows a “user data” script to run when the instance is launched. Launch templates can be versioned, and each Auto Scaling group is configured to launch instances from a specific version of its launch template.</p><p id="74ee" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Although the introduction of environments has made our CI topology more complex, we find that complexity manageable when our infrastructure is defined in code. All of our AWS infrastructure for iOS CI is specified in <a class="af of" href="https://developer.hashicorp.com/terraform/language" rel="noopener ugc nofollow" target="_blank">Terraform</a> code that we check into source control. Each time we merge a pull request related to iOS CI, Terraform Enterprise will automatically apply our changes to our AWS account. 
We have defined a Terraform module that we can call whenever we want to instantiate a new CI environment.</p><figure class="nt nu nv nw nx ny"><div class="pp jh l ff"><figcaption class="pk pl pm nq nr pn po be b bf z dt">Calling a Terraform module to create a CI environment of Arm Mac Minis with Xcode 14.2 installed.</figcaption></div></figure><p id="c029" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">An internal scaling service manages the desired capacity of each environment’s Auto Scaling group. This service, a modified fork of <a class="af of" href="https://github.com/buildkite/buildkite-agent-scaler" rel="noopener ugc nofollow" target="_blank">buildkite-agent-scaler</a>, increases the desired capacity of an environment’s Auto Scaling group as CI job volume for that environment increases. We specify a maximum number of instances for each CI environment in part because On-Demand EC2 Mac Dedicated Hosts currently have a minimum host allocation and billing duration of 24 hours.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr ps"><picture></picture></div></div><figcaption class="pk pl pm nq nr pn po be b bf z dt">A sketch of Airbnb’s new iOS CI system.</figcaption></figure><p id="57c0" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Each CI environment has a unique Buildkite queue name. Individual CI jobs can target instances in a specific environment by specifying the corresponding queue name. 
Jobs will fall back to the default CI environment when no queue name is explicitly specified.</p><h1 id="4890" class="og oh gq be oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bj">Benefits of Our New iOS CI System</h1><h2 id="7d1f" class="pt oh gq be oi pu pv dx om pw px dz oq nc py pz qa ng qb qc qd nk qe qf qg qh bj">CI Environments Are Highly Flexible</h2><p id="81c0" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">With this new Terraform setup we are able to support an arbitrary number of CI environments with minimal overhead. We create a new CI environment per CPU architecture and version of Xcode. We can even duplicate these environments across multiple versions of macOS when performing an operating system update across our fleet. We use dedicated staging environments to test CI jobs on instances launched from a new AMI before we roll out that AMI broadly.</p><p id="2020" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">When we are no longer regularly using a CI environment, we can specify a minimum capacity of zero when calling the Terraform module, which will set the same value on the underlying Auto Scaling group. Then the Auto Scaling group will only launch instances when its desired capacity is increased by the scaling service. In practice, we tend to delete older environments from our Terraform code. 
However, even once an environment has been wound down, reinstating that environment is as simple as reverting a couple of commits in Git and redeploying the scaling service.</p><h2 id="7b37" class="pt oh gq be oi pu pv dx om pw px dz oq nc py pz qa ng qb qc qd nk qe qf qg qh bj">Rotation of Instances Increases CI Consistency</h2><p id="1589" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">To minimize the opportunity for EC2 instances to drift, we terminate all instances each night and replace them daily. This way, we can be confident that our CI fleet is in a known good state at the start of each day.</p><p id="0e5f" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">When an instance is terminated, the underlying dedicated host is <a class="af of" href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-mac-instances.html#mac-instance-stop" rel="noopener ugc nofollow" target="_blank">scrubbed</a> before a new instance can be launched on that host. We terminate instances at a time when CI demand is low to allow for the EC2 Mac scrubbing process to complete before we need to launch fresh instances on the same hosts. When an instance terminates itself overnight, it will decrement the desired capacity of the Auto Scaling group to which it belongs. As engineers start pushing commits the next day, the scaling service will increment the desired capacity on the appropriate Auto Scaling groups, causing new instances to be launched.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qi"><picture></picture></div></div><figcaption class="pk pl pm nq nr pn po be b bf z dt">Instances terminate themselves overnight. We reduce our maximum capacity over weekends. 
The spikes in job volume that increased capacity on the 2nd, 6th, and 7th have been hidden by smoothing in the chart.</figcaption></figure><p id="79ff" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">When an instance does experience configuration drift, we can disconnect that instance from Buildkite with one click. The instance will remain running but will no longer accept new CI jobs. An engineer can log into the instance to investigate its state until the instance is eventually terminated at the end of the day. To keep overall CI capacity stable, we can manually add an additional instance to our fleet, or a replacement will be launched automatically if we terminate the instance early.</p><h2 id="7ff5" class="pt oh gq be oi pu pv dx om pw px dz oq nc py pz qa ng qb qc qd nk qe qf qg qh bj">We Ship Xcode Versions More Quickly</h2><p id="ba67" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">We appreciate the new capabilities of our upgraded CI system. We can lease additional Dedicated Hosts from Amazon on demand to weather unexpected spikes in CI usage and to test software updates thoroughly. We roll out new AMIs gradually and can roll back painlessly if we encounter unexpected issues.</p><figure class="nt nu nv nw nx ny nq nr paragraph-image"><div role="button" tabindex="0" class="nz oa ff ob bg oc"><div class="nq nr qi"><picture></picture></div></div><figcaption class="pk pl pm nq nr pn po be b bf z dt">CI jobs shift from Xcode 14.1 to 14.2. On the 24th, we temporarily increased 14.2 capacity to accommodate a spike in jobs.</figcaption></figure><p id="d65a" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Together, these capabilities get Airbnb iOS developers access to Swift language features and Xcode IDE improvements more quickly. 
In fact, with the tailwind of our new CI system, we have seen the pace at which we update Xcode increase by over 20%. As of the time of writing, we have internally rolled out all available major and minor versions of Xcode 14 (14.0–14.3) as they have been released.</p><h1 id="1f92" class="og oh gq be oi oj ok ol om on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd bj">The Migration is Complete</h1><p id="ebc0" class="pw-post-body-paragraph mr ms gq mt b mu pe mw mx my pf na nb nc pg ne nf ng ph ni nj nk pi nm nn no gj bj">Our new CI system ran over 10 million minutes of CI jobs in the last three months of 2022. After upgrading to EC2, we spend meaningfully fewer hours on maintenance despite a growing codebase and consistently high job volume. Our newfound ability to scale CI to meet the evolving needs of the Airbnb iOS community justifies the increased complexity of the rebuilt system.</p><p id="b077" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">After the migration to AWS, iOS CI benefits more from shared infrastructure that is already being used successfully within Airbnb. For example, the new iOS CI architecture enabled us to avoid implementing an iOS-specific solution for automatically scaling capacity. Instead, we leverage the aforementioned fork of <a class="af of" href="https://github.com/buildkite/buildkite-agent-scaler" rel="noopener ugc nofollow" target="_blank">buildkite-agent-scaler</a> that Airbnb engineers had already converted to an internal Airbnb service complete with a dedicated deployment pipeline. 
Additionally, we used existing Terraform modules that are maintained by other teams to integrate with <a class="af of" href="https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html" rel="noopener ugc nofollow" target="_blank">IAM</a> and <a class="af of" href="https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html" rel="noopener ugc nofollow" target="_blank">SSM</a>.</p><p id="1ef3" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We have found that EC2 Mac instances launched from custom AMIs provide many of the benefits of virtualization without the performance penalty of executing within a virtual machine. We consider AWS, Packer, and Terraform to be essential technologies for building a flexible CI system for large-scale iOS development in 2023.</p></div></div></div><div class="ab ca qj qk ql qm" role="separator"><div class="gj gk gl gm gn"><div class="ab ca"><div class="cb cc cd ce cf cg ch bg"><p id="ac66" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">Xianwen Chen, the technical lead of this project, designed the topology of the iOS CI system, implemented the design with Terraform, and later enabled creation of AMIs in the cloud. Michael Bachand built the initial version of our Packer tooling and used this tooling to create the first programmatically built AMIs capable of completing iOS CI jobs. Steven Hepting productionized our Packer tooling by adding support for Arm AMIs and evolving the Packer template so that all of Airbnb’s iOS CI jobs could run successfully on both CPU architectures.</p><p id="a5d5" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">We received invaluable support from numerous subject-matter experts at Airbnb who were very generous with their time. 
Many thanks to Brandon Kurtz for advising on content and voice through multiple revisions of this article.</p><p id="64cd" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj">If you are interested in joining us on our quest to make the best iOS app in the App Store, please see our <a class="af of" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">careers</a> page for open iOS roles.</p><p id="8a5f" class="pw-post-body-paragraph mr ms gq mt b mu mv mw mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no gj bj"><em class="np">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div></div></div></div></div></section></div></div></article>
<article class="dv"><div class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vf vg vh vi vj vk vl vm vn vo vp vq vr"><div class="vs"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="vu vv vw vx vy"><img alt="" class="bg vz wa wb wc" src="https://miro.medium.com/v2/resize:fit:1358/1*y0vJwEfN45barnQO9jiYew.jpeg" role="presentation" /></div></a></div><div class="vt ab ca cn"><div class="wd we wf wg wh ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@frank-andrade?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="l ff"><img alt="The PyCoach" class="l fa bx wi wj cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*veEX4-CiLz5jqUjwWfQo_Q.jpeg" width="20" height="20" /></div></a></div></div></div><div class="wk l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@frank-andrade?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><p class="be b du z jh 
ji jj jk jl jm jn jo bj">The PyCoach</p></a></div></div></div><div class="wk l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/artificial-corner?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Artificial Corner</p></a></div></div></div></div><div class="wl wm wn wo wp wq wr ws wt wu l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div title="You’re Using ChatGPT Wrong! Here’s How to Be Ahead of 99% of ChatGPT Users"><h2 class="be gr oj ol wv ww om on op wx wy oq nc pz wz xa qa ng qc xb xc qd nk qf xd xe qg jh jj jk jm jo bj">You’re Using ChatGPT Wrong! 
Here’s How to Be Ahead of 99% of ChatGPT Users</h2></div><div class="xf l"><h3 class="be b iq z jh xg jj jk xh jm jo dt">Master ChatGPT by learning prompt engineering.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="ab q"><div class="ri ab"><div class="bl" aria-hidden="false"></div>·7 min read·Mar 17</div></div></a><div class="xi xj xk xl xm l"><div class="ab co"><div class="am xn xo xp xq xr xs xt xu xv xw ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fartificial-corner%2F886a50dabc54&amp;operation=register&amp;redirect=https%3A%2F%2Fartificialcorner.com%2Fyoure-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54&amp;user=The+PyCoach&amp;userId=fb44e21903f3&amp;source=-----886a50dabc54----0-----------------clap_footer----ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="xx l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----4ab33ea4072f----0---------------------ef4894dd_0252_4376_be93_89eab641df9a-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q xz 
ya"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vf vg vh vi vj vk vl vm vn vo vp vq vr"><div class="vs"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/@michalmalewicz/there-are-five-levels-of-ui-skill-62e0e7700855?source=read_next_recirc-----4ab33ea4072f----1---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="vu vv vw vx vy"><img alt="UI Design Skill Levels for Dark Mode" class="bg vz wa wb wc" src="https://miro.medium.com/v2/resize:fit:1358/1*faOr3jx-Czy7hd1p9YFBHQ.png" /></div></a></div><div class="vt ab ca cn"><div class="wd we wf wg wh ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@michalmalewicz?source=read_next_recirc-----4ab33ea4072f----1---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="l ff"><img alt="Michal Malewicz" class="l fa bx wi wj cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*tDHV0f1BqZoVtDO8b-iu_A.jpeg" width="20" height="20" /></div></a></div></div></div><div class="wk l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@michalmalewicz?source=read_next_recirc-----4ab33ea4072f----1---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Michal Malewicz</p><div class="zj zk l"><div class="ab"><div class="ab"></div></div></div></a></div></div></div></div><div class="wl wm wn wo wp wq wr ws wt wu l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" 
href="https://medium.com/@michalmalewicz/there-are-five-levels-of-ui-skill-62e0e7700855?source=read_next_recirc-----4ab33ea4072f----1---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div title=""><h2 class="be gr oj ol wv ww om on op wx wy oq nc pz wz xa qa ng qc xb xc qd nk qf xd xe qg jh jj jk jm jo bj">There are FIVE levels of UI skill.</h2></div><div class="xf l"><h3 class="be b iq z jh xg jj jk xh jm jo dt">Only level 4+ gets you hired.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/@michalmalewicz/there-are-five-levels-of-ui-skill-62e0e7700855?source=read_next_recirc-----4ab33ea4072f----1---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="ab q"><div class="ri ab"><div class="bl" aria-hidden="false"></div>·6 min read·Apr 25</div></div></a><div class="xi xj xk xl xm l"><div class="ab co"><div class="am xn xo xp xq xr xs xt xu xv xw ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2F62e0e7700855&amp;operation=register&amp;redirect=https%3A%2F%2Fmichalmalewicz.medium.com%2Fthere-are-five-levels-of-ui-skill-62e0e7700855&amp;user=Michal+Malewicz&amp;userId=fde1eb3eb589&amp;source=-----62e0e7700855----1-----------------clap_footer----ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="xx l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/@michalmalewicz/there-are-five-levels-of-ui-skill-62e0e7700855?source=read_next_recirc-----4ab33ea4072f----1---------------------ef4894dd_0252_4376_be93_89eab641df9a-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q xz ya"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vf vg vh vi vj vk vl vm vn vo vp vq vr"><div class="vs"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/@stevenpcurtis/use-the-decorator-pattern-for-repository-caching-and-cache-invalidation-57ae58d6d87b?source=read_next_recirc-----4ab33ea4072f----2---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="vu vv vw vx vy"><img alt="" class="bg vz wa wb wc" src="https://miro.medium.com/v2/resize:fit:1358/0*OFuMC7td-rRPxGfq" role="presentation" /></div></a></div><div class="vt ab ca cn"><div class="wd we wf wg wh ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@stevenpcurtis?source=read_next_recirc-----4ab33ea4072f----2---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="l ff"><img alt="Steven Curtis" class="l fa bx wi wj cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*ivmbg0Ef6mylufSXa9aLBw.jpeg" width="20" height="20" /></div></a></div></div></div><div class="wk l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@stevenpcurtis?source=read_next_recirc-----4ab33ea4072f----2---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><p class="be b du z jh 
ji jj jk jl jm jn jo bj">Steven Curtis</p></a></div></div></div></div><div class="wl wm wn wo wp wq wr ws wt wu l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/@stevenpcurtis/use-the-decorator-pattern-for-repository-caching-and-cache-invalidation-57ae58d6d87b?source=read_next_recirc-----4ab33ea4072f----2---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div title="Use the Decorator Pattern for Repository Caching and Cache Invalidation"><h2 class="be gr oj ol wv ww om on op wx wy oq nc pz wz xa qa ng qc xb xc qd nk qf xd xe qg jh jj jk jm jo bj">Use the Decorator Pattern for Repository Caching and Cache Invalidation</h2></div><div class="xf l"><h3 class="be b iq z jh xg jj jk xh jm jo dt">In Swift, naturally</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/@stevenpcurtis/use-the-decorator-pattern-for-repository-caching-and-cache-invalidation-57ae58d6d87b?source=read_next_recirc-----4ab33ea4072f----2---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="ab q"><div class="ri ab"><div class="bl" aria-hidden="false"></div>·6 min read·Feb 9</div></div></a><div class="xi xj xk xl xm l"><div class="ab co"><div class="am xn xo xp xq xr xs xt xu xv xw ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fp%2F57ae58d6d87b&amp;operation=register&amp;redirect=https%3A%2F%2Fstevenpcurtis.medium.com%2Fuse-the-decorator-pattern-for-repository-caching-and-cache-invalidation-57ae58d6d87b&amp;user=Steven+Curtis&amp;userId=c109496e47e&amp;source=-----57ae58d6d87b----2-----------------clap_footer----ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="xx l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@stevenpcurtis/use-the-decorator-pattern-for-repository-caching-and-cache-invalidation-57ae58d6d87b?source=read_next_recirc-----4ab33ea4072f----2---------------------ef4894dd_0252_4376_be93_89eab641df9a-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q xz ya"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="dv"><div class="dv ri l"><div class="bg dv"><div class="dv l"><div class="dv vf vg vh vi vj vk vl vm vn vo vp vq vr"><div class="vs"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/gitconnected/why-i-keep-failing-candidates-during-google-interviews-dc8f865b2c19?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="vu vv vw vx vy"><img alt="" class="bg vz wa wb wc" src="https://miro.medium.com/v2/resize:fit:1358/1*FyaF0pPskcOtQ_MmEnBjZA.jpeg" role="presentation" /></div></a></div><div class="vt ab ca cn"><div class="wd we wf wg wh ab"><div class="qu l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap 
aq ar as at" rel="noopener follow" href="https://medium.com/@alexcancode?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="l ff"><img alt="Alexander Nguyen" class="l fa bx wi wj cw" src="https://miro.medium.com/v2/resize:fill:40:40/1*cwYWYCjbeXNc_pAtTeq_Zg.jpeg" width="20" height="20" /></div></a></div></div></div><div class="wk l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" rel="noopener follow" href="https://medium.com/@alexcancode?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><p class="be b du z jh ji jj jk jl jm jn jo bj">Alexander Nguyen</p></a></div></div></div><div class="wk l"><p class="be b du z dt">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar is ab q" href="https://medium.com/gitconnected?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------" rel="noopener follow"><p class="be b du z jh ji jj jk jl jm jn jo bj">Level Up Coding</p></a></div></div></div></div><div class="wl wm wn wo wp wq wr ws wt wu l gj"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/gitconnected/why-i-keep-failing-candidates-during-google-interviews-dc8f865b2c19?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div title=""><h2 class="be gr oj ol wv ww om on op wx wy oq nc pz wz xa qa ng qc xb xc qd nk qf xd xe qg jh jj jk jm jo bj">Why I Keep Failing Candidates During Google Interviews…</h2></div><div class="xf l"><h3 class="be b iq z jh xg jj jk xh jm jo dt">They don’t meet the bar.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" 
rel="noopener follow" href="https://medium.com/gitconnected/why-i-keep-failing-candidates-during-google-interviews-dc8f865b2c19?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="ab q"><div class="ri ab"><div class="bl" aria-hidden="false"></div>·4 min read·Apr 13</div></div></a><div class="xi xj xk xl xm l"><div class="ab co"><div class="am xn xo xp xq xr xs xt xu xv xw ab q"><div class="ab q kx"><div class="pw-multi-vote-icon ff jg ky kz la"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fgitconnected%2Fdc8f865b2c19&amp;operation=register&amp;redirect=https%3A%2F%2Flevelup.gitconnected.com%2Fwhy-i-keep-failing-candidates-during-google-interviews-dc8f865b2c19&amp;user=Alexander+Nguyen&amp;userId=a148fd75c2e9&amp;source=-----dc8f865b2c19----3-----------------clap_footer----ef4894dd_0252_4376_be93_89eab641df9a-------"><div class="lb ao fg lc ld le am lf lg lh la"></div></a></div><div class="pw-multi-vote-count l li lj lk ll lm ln lo"><p class="be b du z dt">--</p></div></div><div class="xx l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/gitconnected/why-i-keep-failing-candidates-during-google-interviews-dc8f865b2c19?source=read_next_recirc-----4ab33ea4072f----3---------------------ef4894dd_0252_4376_be93_89eab641df9a-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q xz ya"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/flexible-continuous-integration-for-ios-4ab33ea4072f</link>
      <guid>https://medium.com/airbnb-engineering/flexible-continuous-integration-for-ios-4ab33ea4072f</guid>
      <pubDate>Wed, 10 May 2023 19:01:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Michael Kinoti]]></title>
      <description><![CDATA[<article><div class="l"><div class="l"><section><div><div class="ho hp hq hr hs"><div class="ab cm"><div class="fg fh fi fj fk fl fm bg"><div class=""><div class="iw ix iy iz ja"><div class="speechify-ignore ab fs"><div class="speechify-ignore bg l"><div class="jb jc jd je jf ab"><div><div class="ab jg"><a rel="noopener follow" href="https://medium.com/@lauren.mackevich?source=post_page-----645d4c228d06--------------------------------"><div><div class="bl" aria-hidden="false"><div class="l jh ji bx jj jk"><div class="l dj"><img alt="Lauren Mackevich" class="l df bx gf gg ff" src="https://miro.medium.com/v2/resize:fill:88:88/0*-imhApAGWwgM89i1.jpg" width="44" height="44" /></div></div></div></div></a><a href="https://medium.com/airbnb-engineering?source=post_page-----645d4c228d06--------------------------------" rel="noopener follow"><div class="jo ab dj"><div><div class="bl" aria-hidden="false"><div class="l jp jq bx jj jr"><div class="l dj"><img alt="The Airbnb Tech Blog" class="l df bx bq js ff" src="https://miro.medium.com/v2/resize:fill:48:48/1*MlNQKg-sieBGW5prWoe9HQ.jpeg" width="24" height="24" /></div></div></div></div></div></a></div></div><div class="bm bg l"><div class="ab"><div><div class="jt ab q"><div class="ab q ju"><div class="ab q"><div><div class="bl" aria-hidden="false"><p class="be b jv jw bj"><a class="af ag ah ai aj ak al am an ao ap aq ar jx" rel="noopener follow" href="https://medium.com/@lauren.mackevich?source=post_page-----645d4c228d06--------------------------------">Lauren Mackevich</a></p></div></div></div>·<p class="be b jv jw dl"><a class="ka kb ah ai aj ak al am an ao ap aq ar ek kc kd" rel="noopener follow" 
href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2Fae9de0d76057&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmy-journey-to-airbnb-michael-kinoti-645d4c228d06&amp;user=Lauren+Mackevich&amp;userId=ae9de0d76057&amp;source=post_page-ae9de0d76057----645d4c228d06---------------------post_header-----------">Follow</a></p></div></div></div></div><div class="l ke"><div class="ab fq kf kg kh"><div class="ki kj ab"><div class="be b bf z dl ab kk">Published in<div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" href="https://medium.com/airbnb-engineering?source=post_page-----645d4c228d06--------------------------------" rel="noopener follow"><p class="be b bf z km kn ko kp kq kr ks kt bj">The Airbnb Tech Blog</p></a></div></div></div><div class="h k">·</div></div><div class="ab ae">7 min read<div class="ku kv l" aria-hidden="true">·</div>Just now</div></div></div></div></div><div class="ab fs kw kx ky kz la lb lc ld le lf lg lh li lj lk ll"><div class="h k w es et q"><div class="mb l"><div class="ab q cc"><div class="pw-multi-vote-icon dj kl mc md me"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fairbnb-engineering%2F645d4c228d06&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmy-journey-to-airbnb-michael-kinoti-645d4c228d06&amp;user=Lauren+Mackevich&amp;userId=ae9de0d76057&amp;source=-----645d4c228d06---------------------clap_footer-----------"><div class="mf ao ev mg mh mi am mj mk ml me"></div></a></div><div class="pw-multi-vote-count l mm mn mo mp mq mr ms"><p class="be b dm z dl">--</p></div></div></div><div><div class="bl" aria-hidden="false"></div></div><div class="ab q lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma"><div class="h k"><div class="df sv fq"><div class="l ae"><div class="ab cm"><div 
class="sw sx sy sz ta oi fm bg"><div class="ab"><div class="bl bg" aria-hidden="false"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div><div class="bl" aria-hidden="false" aria-describedby="postFooterSocialMenu" aria-labelledby="postFooterSocialMenu"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div><figure class="ny nz oa ob oc od nv nw paragraph-image"><div role="button" tabindex="0" class="oe of dj og bg oh"><div class="nv nw nx"><picture></picture></div></div></figure><p id="38ae" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Saying no to med school and following a dream all the way to Silicon Valley</p><p id="aa7c" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj"><em class="pi">Becoming a doctor and trying to make it as a DJ have both crossed </em><a class="af pj" href="https://www.linkedin.com/in/michael-kinoti-7a309215/" rel="noopener ugc nofollow" target="_blank"><em class="pi">Michael Kinoti’s</em></a><em class="pi"> mind at one time or another. Instead, we’re lucky to have Michael (who goes by Kinoti) as Airbnb’s Director of Engineering for the Marketing Technology team. He brings with him over 15 years of industry experience at Microsoft and Uber, as well as a global perspective from his childhood in Kenya. Kinoti is passionate about travel and having a large-scale social impact, qualities that align nicely with Airbnb’s mission and vision. 
Here’s Kinoti’s story in his own words.</em></p><h1 id="8e93" class="pk pl hv be pm pn po pp pq pr ps pt pu pv pw px py pz qa qb qc qd qe qf qg qh bj">Doctor, lawyer, or engineer?</h1><p id="8ec9" class="pw-post-body-paragraph ok ol hv om b on qi op oq or qj ot ou ov qk ox oy oz ql pb pc pd qm pf pg ph ho bj">Anybody who grew up in Kenya around when I did is probably aware that medicine, law, and engineering were the <em class="pi">only </em>options for an ambitious student. And at least in my family, while all three careers were highly regarded, nothing was quite as prized as becoming a doctor.</p><p id="a685" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Needless to say, I made my parents very proud when I was accepted to medical school. This is the story of the more than 20 years since, during which I’ve learned so much and have found my niche — as a software engineering leader.</p><h1 id="2ef5" class="pk pl hv be pm pn po pp pq pr ps pt pu pv pw px py pz qa qb qc qd qe qf qg qh bj">Choosing my own adventure</h1><p id="e387" class="pw-post-body-paragraph ok ol hv om b on qi op oq or qj ot ou ov qk ox oy oz ql pb pc pd qm pf pg ph ho bj">It may help to describe how I arrived at that medical school acceptance in the first place, and how by that time, I had already started developing an interest in computers. As the child of an entrepreneur father and an engineer mother, I am grateful to have had two amazing role models from a young age. My mother’s grit has been particularly inspiring. She was the only female in her university cohort, and time and time again she has had to work harder than everybody else to prove herself. 
She became a leader at Kenya’s major telecom company and directly contributed to bringing the country high-speed Internet, giving my generation and those after it access to a world of information.</p><p id="e4ce" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Always proponents of education, my parents sent me to one of the country’s premier institutions, the Starehe Boys’ Centre and School. The school’s mission is to develop youth into better human beings and leaders, with an emphasis on a holistic education beyond just academics. To this day, I live by the school’s values: integrity, leadership, and service.</p><p id="8695" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">I was lucky to be one of the first students in Kenya to take a computer studies course in school. While most students, myself included, used the computer lab to play video games, I also spent a lot of my time learning how to code. Seeing my deep interest in programming, my classmates would joke that I was going to be the next Bill Gates. I think those formative experiences instilled in me a dream to one day work at Microsoft, one of the biggest technology companies in the world. I wanted to make an impact through technology, which I saw and continue to see as an engine for leveling the global playing field. At the time, however, that was all just a dream and Redmond, Washington couldn’t have felt further away.</p><p id="0ae1" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">For a while, I put chasing that dream on hold as I applied to medical school and was later accepted. Everything was set for me to matriculate in six months’ time, and during that gap I took the opportunity to get some more hands-on coding experience. 
I realized that medicine wasn’t for me and that not only did I want to study software engineering, but I wanted to come to the United States to be as close as possible to Silicon Valley.</p><p id="4cb8" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">It took a lot of courage to admit this change of heart to my parents, and their initial reaction was a pragmatic one — it was too expensive, too far, and just not feasible to go abroad. Still, I was motivated to give it a try, so I took the SAT on my own and did everything else needed to apply to American colleges. I received some acceptance letters and, by showing my initiative, convinced my parents that my dream might be within reach after all.</p><h1 id="99e7" class="pk pl hv be pm pn po pp pq pr ps pt pu pv pw px py pz qa qb qc qd qe qf qg qh bj">Dreams do come true, but then what?</h1><p id="d775" class="pw-post-body-paragraph ok ol hv om b on qi op oq or qj ot ou ov qk ox oy oz ql pb pc pd qm pf pg ph ho bj">I remember being extremely excited leading up to my move to the US, until about halfway through the plane journey when it hit me that I was leaving my family and everything I knew behind. I had to overcome that fear and the imposter syndrome that came from doubting whether I picked the right path. To add to that, there’s a lot of culture shock that accompanies moving continents. For the first time in my life I was an ethnic minority and had to grapple with what that meant.</p><p id="b6ff" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">For me, the culture shock extended into the classroom, too. I enrolled at the Florida Institute of Technology to study software engineering. It was my first time seeing students challenging teachers and engaging in open discussions. 
Putting myself in a new environment exposed me to an entirely new outlook.</p><p id="1d54" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">The amount of effort I put into adjusting to my new home paid off more than I could have ever imagined. I used to be a kid in Kenya with a vague sense that coming to America was the right move for me. With a lot of hard work, I got a job at my dream company: Microsoft!</p><p id="8318" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Microsoft had so much support and mentorship, along with growth and learning opportunities that kept me busy for 11 years. To an extent, I was still following the mindset I grew up with that values loyalty. The path was clear: I could have stayed at the company indefinitely, then gotten married, and soon after started a family. What I learned from all the friends I made over those years, however, is that Microsoft is just one of many companies doing amazing things. Once you get to a point in your career where you’re not growing the same way, you’re not learning the same way, or you just want a different challenge, <em class="pi">it’s okay to change</em>.</p><p id="6c74" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">I had focused on infrastructure during my time at Microsoft, and as much as I enjoyed it, I wanted to keep exploring. I joined Uber to lead the team building the company’s customer support platform. This is where I discovered my niche for building platforms at the sweet spot between product and infrastructure. 
I love being able to shape systems that directly affect millions of people and translate into features that people can see and feel.</p><h1 id="a05c" class="pk pl hv be pm pn po pp pq pr ps pt pu pv pw px py pz qa qb qc qd qe qf qg qh bj">Why I picked Airbnb</h1><p id="cc0f" class="pw-post-body-paragraph ok ol hv om b on qi op oq or qj ot ou ov qk ox oy oz ql pb pc pd qm pf pg ph ho bj">After a bit over three years at Uber, I made the switch to Airbnb, which felt right for so many reasons. Airbnb’s mission around building belonging and connection really resonated with me. The company has an ambitious vision, and I believe promoting belonging and connection is fundamental to solving so many other societal problems. This, in addition to travel being a passion of mine (I’ve been to 55 countries across 6 continents!), made me very excited about Airbnb.</p><p id="55cb" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">The way Airbnb works toward its mission is unique. We have a creative touch when it comes to technology, something our CEO, Brian, encourages a lot. We care deeply about the details and about chasing perfection in a healthy way. Of course, there’s no such thing as actual perfection, but striving for it the way we do at Airbnb produces great results, something our users see every time they interact with the product.</p><p id="1459" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Arguably the biggest factor in my decision, though, was Airbnb’s culture. This can be hard to fully put into words; what I love about Airbnb’s culture comes through in all the little things you experience day to day. There’s a genuine warmth between people who seek to build belonging and connection everywhere they go. People are welcoming, particularly to new hires. 
Even the interview process at Airbnb feels more human and conversational, which is different from so many other companies. Culture starts with the details and adds up into bigger things too: I think Airbnb truly excels at work-life integration and as somebody who recently started a family, I’m very glad I came here.</p><h1 id="db68" class="pk pl hv be pm pn po pp pq pr ps pt pu pv pw px py pz qa qb qc qd qe qf qg qh bj">Integrity, leadership, and service at Airbnb</h1><p id="c6fb" class="pw-post-body-paragraph ok ol hv om b on qi op oq or qj ot ou ov qk ox oy oz ql pb pc pd qm pf pg ph ho bj">At Airbnb, I lead the Marketing Technology (Growth Platform) team responsible for Canvas, an internal platform that enables marketing and product teams to effectively engage with customers. Our overarching goal is to drive business growth and product engagement. Canvas has tools for creating, managing, and measuring content that gets published both to Airbnb and offsite channels such as emails, notifications, and ads. I’ve reaffirmed how much I enjoy my role as a platform owner since I get to be at the nexus of so many areas that are important to the overall business. I get to think about everything from notifications, to personalization using machine learning, to the underlying infrastructure powering it all.</p><p id="3e81" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">On a daily basis, I put into practice the three values I’ve lived by since my days in school: integrity, leadership, and service. My philosophy around leadership is that it’s not about power or being in charge. Rather, leadership is a form of service. I make a point to be empathetic, and my mission as a leader is to unlock the best in others through coaching and mentorship. 
My own mentors and coaches have played a large part in getting me to where I am, and I seek to pay that forward.</p><p id="bdc1" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Of course, integrity also remains at the heart of every decision I make as a leader. Privacy and compliance are key focus areas for my team right now, which I enjoy because of the strong alignment those goals have with my value of integrity. To me, integrity means handling user data with the same care I’d want for my own data.</p><p id="6234" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">Currently, we’re also doing cutting-edge work on personalizing our marketing. Instead of blasting out the same email campaign to every user, we want to identify the journey a particular user is on and customize the content they see to be more relevant. Not only is this an interesting technical problem, it’s also a nuanced issue of respecting user privacy while offering a more tailored experience.</p><p id="a5b1" class="pw-post-body-paragraph ok ol hv om b on oo op oq or os ot ou ov ow ox oy oz pa pb pc pd pe pf pg ph ho bj">In the last couple of years, Airbnb has been undergoing an incredible transformation from a startup into a mature company. There’s a huge modernization effort within the company to scale our tech stack to match the scale at which we have a global impact. If that’s interesting to you, I encourage you to <a class="af pj" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">check out openings at Airbnb</a>.
If there’s one thing I’ve learned from the wild ride I’ve had so far, it’s that there’s no set path you need to follow: take a chance, and you’ll be amazed by what you might achieve.</p></div></div></div></div></div></div></div></div></section></div></div></article><article class="di"><div class="di ra l"><div class="bg di"><div class="di l"><div class="di up uq ur us ut uu uv uw ux uy uz va vb"><div class="vc"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/in-fitness-and-in-health/looking-better-than-99-of-people-over-40-is-about-1-thing-ddcd39936cf7?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="ve vf vg cy dc"><img alt="Looking Better Than 99% of People Over 40 is About One Thing" class="bg vh vi vj vk" src="https://miro.medium.com/v2/resize:fit:0/1*I2tD0Pxd-v2VHfsOSD_rQg.png" /></div></a></div><div class="vd ab cm fr"><div class="vl vm vn vo vp ab"><div class="qq l"><div><div class="bl" aria-hidden="false"><a
class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@offacoach?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="l dj"><img alt="Chris Davidson" class="l df bx vq vr ff" src="https://miro.medium.com/v2/resize:fill:40:40/1*7B3PdiW_B7GmSgD0PqJEdA.png" width="20" height="20" /></div></a></div></div></div><div class="vs l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" rel="noopener follow" href="https://medium.com/@offacoach?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><p class="be b dm z km kn ko kp kq kr ks kt bj">Chris Davidson</p></a></div></div></div><div class="vs l"><p class="be b dm z dl">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" href="https://medium.com/in-fitness-and-in-health?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------" rel="noopener follow"><p class="be b dm z km kn ko kp kq kr ks kt bj">In Fitness And In Health</p></a></div></div></div></div><div class="vt vu vv vw vx vy vz wa wb wc l ho"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/in-fitness-and-in-health/looking-better-than-99-of-people-over-40-is-about-1-thing-ddcd39936cf7?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><h2 class="be hw pn pp wd we pq pr pt wf wg pu ov wh wi wj wk oz wl wm wn wo pd wp wq wr ws km ko kp kr kt bj">Looking Better Than 99% of People Over 40 is About One Thing</h2><div class="wt l"><h3 class="be b jv z km wu ko kp wv kr kt dl">Not workouts, diet, supplements or any ‘hack’. 
Once you nail this there’s no stopping you.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/in-fitness-and-in-health/looking-better-than-99-of-people-over-40-is-about-1-thing-ddcd39936cf7?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="ab q"><div class="ra ab"><div class="bl" aria-hidden="false"></div>·5 min read·Apr 17</div></div></a><div class="ww wx wy wz xa l"><div class="ab fs"><div class="am xb xc xd xe xf xg xh xi xj xk ab q"><div class="ab q cc"><div class="pw-multi-vote-icon dj kl mc md me"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fin-fitness-and-in-health%2Fddcd39936cf7&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fin-fitness-and-in-health%2Flooking-better-than-99-of-people-over-40-is-about-1-thing-ddcd39936cf7&amp;user=Chris+Davidson&amp;userId=849e2ebc72a8&amp;source=-----ddcd39936cf7----1-----------------clap_footer----1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="mf ao ev mg mh mi am mj mk ml me"></div></a></div><div class="pw-multi-vote-count l mm mn mo mp mq mr ms"><p class="be b dm z dl">--</p></div></div><div class="xl l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/in-fitness-and-in-health/looking-better-than-99-of-people-over-40-is-about-1-thing-ddcd39936cf7?source=read_next_recirc-----645d4c228d06----1---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q xn xo"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i 
d"></div></div></div></div></div></div></div></article><article class="di"><div class="di ra l"><div class="bg di"><div class="di l"><div class="di up uq ur us ut uu uv uw ux uy uz va vb"><div class="vc"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/an-injustice/do-people-really-not-want-to-work-anymore-cf8cb6f37249?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="ve vf vg cy dc"><img alt="Do People Really Not Want To Work Anymore?" class="bg vh vi vj vk" src="https://miro.medium.com/v2/resize:fit:0/1*wLaFlFTkWigjdjWWLdjBOQ@2x.jpeg" /></div></a></div><div class="vd ab cm fr"><div class="vl vm vn vo vp ab"><div class="qq l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@rosalynmorris?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="l dj"><img alt="Rosalyn Morris" class="l df bx vq vr ff" src="https://miro.medium.com/v2/resize:fill:40:40/1*Sh_5XYPlGuzp8kL4Rn6O9A@2x.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vs l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" rel="noopener follow" href="https://medium.com/@rosalynmorris?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><p class="be b dm z km kn ko kp kq kr ks kt bj">Rosalyn Morris</p></a></div></div></div><div class="vs l"><p class="be b dm z dl">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" href="https://medium.com/an-injustice?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------" rel="noopener follow"><p class="be 
b dm z km kn ko kp kq kr ks kt bj">An Injustice!</p></a></div></div></div></div><div class="vt vu vv vw vx vy vz wa wb wc l ho"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/an-injustice/do-people-really-not-want-to-work-anymore-cf8cb6f37249?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><h2 class="be hw pn pp wd we pq pr pt wf wg pu ov wh wi wj wk oz wl wm wn wo pd wp wq wr ws km ko kp kr kt bj">Do People Really Not Want To Work Anymore?</h2><div class="wt l"><h3 class="be b jv z km wu ko kp wv kr kt dl">Older generations keep saying that people just don’t want to work anymore. By older generations I mean “Baby Boomers” and “Gen X.” Heck…</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/an-injustice/do-people-really-not-want-to-work-anymore-cf8cb6f37249?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="ab q"><div class="ra ab"><div class="bl" aria-hidden="false"></div>·4 min read·6 days ago</div></div></a><div class="ww wx wy wz xa l"><div class="ab fs"><div class="am xb xc xd xe xf xg xh xi xj xk ab q"><div class="ab q cc"><div class="pw-multi-vote-icon dj kl mc md me"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fan-injustice%2Fcf8cb6f37249&amp;operation=register&amp;redirect=https%3A%2F%2Faninjusticemag.com%2Fdo-people-really-not-want-to-work-anymore-cf8cb6f37249&amp;user=Rosalyn+Morris&amp;userId=66289c111f63&amp;source=-----cf8cb6f37249----2-----------------clap_footer----1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="mf ao ev mg mh mi am mj mk ml me"></div></a></div><div 
class="pw-multi-vote-count l mm mn mo mp mq mr ms"><p class="be b dm z dl">--</p></div></div><div class="xl l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/an-injustice/do-people-really-not-want-to-work-anymore-cf8cb6f37249?source=read_next_recirc-----645d4c228d06----2---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" aria-hidden="false"></div></div></a></div></div><div class="ab q xn xo"><div><div class="bl" aria-hidden="false"></div></div></div></div><div class="j i d"></div></div></div></div></div></div></div></article><article class="di"><div class="di ra l"><div class="bg di"><div class="di l"><div class="di up uq ur us ut uu uv uw ux uy uz va vb"><div class="vc"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Image" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="ve vf vg cy dc"><img alt="You’re Using ChatGPT Wrong! 
Here’s How to Be Ahead of 99% of ChatGPT Users" class="bg vh vi vj vk" src="https://miro.medium.com/v2/resize:fit:0/1*y0vJwEfN45barnQO9jiYew.jpeg" /></div></a></div><div class="vd ab cm fr"><div class="vl vm vn vo vp ab"><div class="qq l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/@frank-andrade?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="l dj"><img alt="The PyCoach" class="l df bx vq vr ff" src="https://miro.medium.com/v2/resize:fill:40:40/1*veEX4-CiLz5jqUjwWfQo_Q.jpeg" width="20" height="20" /></div></a></div></div></div><div class="vs l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" rel="noopener follow" href="https://medium.com/@frank-andrade?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><p class="be b dm z km kn ko kp kq kr ks kt bj">The PyCoach</p></a></div></div></div><div class="vs l"><p class="be b dm z dl">in</p></div><div class="l"><div><div class="bl" aria-hidden="false"><a class="af ag ah ai aj ak al am an ao ap aq ar jx ab q" href="https://medium.com/artificial-corner?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------" rel="noopener follow"><p class="be b dm z km kn ko kp kq kr ks kt bj">Artificial Corner</p></a></div></div></div></div><div class="vt vu vv vw vx vy vz wa wb wc l ho"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview Title" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><h2 class="be hw pn pp wd we pq pr pt wf wg pu ov wh wi wj wk oz wl wm wn wo 
pd wp wq wr ws km ko kp kr kt bj">You’re Using ChatGPT Wrong! Here’s How to Be Ahead of 99% of ChatGPT Users</h2><div class="wt l"><h3 class="be b jv z km wu ko kp wv kr kt dl">Master ChatGPT by learning prompt engineering.</h3></div></a></div><a class="af ag ah ai aj ak al am an ao ap aq ar as at" aria-label="Post Preview memebership and reading time information" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="ab q"><div class="ra ab"><div class="bl" aria-hidden="false"></div>·7 min read·Mar 17</div></div></a><div class="ww wx wy wz xa l"><div class="ab fs"><div class="am xb xc xd xe xf xg xh xi xj xk ab q"><div class="ab q cc"><div class="pw-multi-vote-icon dj kl mc md me"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fvote%2Fartificial-corner%2F886a50dabc54&amp;operation=register&amp;redirect=https%3A%2F%2Fartificialcorner.com%2Fyoure-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54&amp;user=The+PyCoach&amp;userId=fb44e21903f3&amp;source=-----886a50dabc54----3-----------------clap_footer----1d12e075_fdb9_42eb_8504_9379f78d4b24-------"><div class="mf ao ev mg mh mi am mj mk ml me"></div></a></div><div class="pw-multi-vote-count l mm mn mo mp mq mr ms"><p class="be b dm z dl">--</p></div></div><div class="xl l"><a class="af ag ah ai aj ak al am an ao ap aq ar as at" rel="noopener follow" href="https://medium.com/artificial-corner/youre-using-chatgpt-wrong-here-s-how-to-be-ahead-of-99-of-chatgpt-users-886a50dabc54?source=read_next_recirc-----645d4c228d06----3---------------------1d12e075_fdb9_42eb_8504_9379f78d4b24-------&amp;responsesOpen=true&amp;sortBy=REVERSE_CHRON"><div><div class="bl" 
aria-hidden="false"></div></div></a></div></div><div class="ab q xn xo"><div><div class="bl" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></article>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-michael-kinoti-645d4c228d06</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-michael-kinoti-645d4c228d06</guid>
      <pubDate>Wed, 26 Apr 2023 22:26:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Improving Istio Propagation Delay]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gk gl gm gn go gp gq gr gs gt l"><div class="ab gu gv"><div class="ab"><div class="fm l"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@yingzhuivy?source=post_page-----d4da9b5b9f90--------------------------------"><div class="l di"><img alt="Ying Zhu" class="l de bw gw gx fe" src="https://miro.medium.com/v2/resize:fill:96:96/1*Ytzu0lxmgezO_P2CpALIVw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bd b gy gz bi"><div class="ha ab q"><div><div class="bk" aria-hidden="false"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@yingzhuivy?source=post_page-----d4da9b5b9f90--------------------------------"><div class="ab q">Ying Zhu</div></a></div></div><div class="hb hc hd he i d"></div></div><div class="ab q hg"><p class="pw-published-date bd b be z dk">Mar 23</p><div class="hh bk" aria-hidden="true">·</div><div class="pw-reading-time bd b be z dk">8 min read</div></div></div></div><div class="ab q"><div class="h k hi hj hk"><div class="hl l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hl l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hl l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hp ab q"></div></div></div><div class="hu s u j i d"><div class="fm l"><div class="hz l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hz l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hz l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="bl l"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ij ik il im in"><div class=""><div class=""><h2 
id="7913" class="pw-subtitle-paragraph jn ip iq bd b jo jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke dk">A case study in service mesh performance optimization</h2></div><p id="6276" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">by: <a class="ae lb" href="https://www.linkedin.com/in/ying-zhu-763a3879/" rel="noopener ugc nofollow" target="_blank">Ying Zhu</a></p><figure class="ld le lf lg gt lh gh gi paragraph-image"><div role="button" tabindex="0" class="li lj di lk bf ll"><div class="gh gi lc"><picture></picture></div></div></figure><h1 id="85a9" class="lo lp iq bd lq lr ls lt lu lv lw lx ly jw lz jx ma jz mb ka mc kc md kd me mf bi">Introduction</h1><p id="7c18" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">In this article, we’ll showcase how we identified and addressed a service mesh performance problem at Airbnb, providing insights into the process of troubleshooting service mesh issues.</p><h2 id="3a80" class="ml lp iq bd lq mm mn dn lu mo mp dp ly ko mq mr ma ks ms mt mc kw mu mv me mw bi">Background</h2><p id="29b8" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">At Airbnb, we use a microservices architecture, which requires efficient communication between services. Initially, we developed a homegrown service discovery system called Smartstack exactly for this purpose. As the company grew, however, we encountered scalability issues¹. To address this, in 2019, we invested in a modern service mesh solution called AirMesh, built on the open-source <a class="ae lb" href="https://istio.io/latest/" rel="noopener ugc nofollow" target="_blank">Istio</a> software. 
Currently, over 90% of our production traffic has been migrated to AirMesh, with plans to complete the migration by 2023.</p><h2 id="0d69" class="ml lp iq bd lq mm mn dn lu mo mp dp ly ko mq mr ma ks ms mt mc kw mu mv me mw bi">The Symptom: Increased Propagation Delay</h2><p id="5abf" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">After we upgraded Istio from 1.11 to 1.12, we noticed a puzzling increase in the propagation delay — the time between when the Istio control plane gets notified of a change event and when the change is processed and pushed to a workload. This delay is important for our service owners because they depend on it to make critical routing decisions. For example, servers need to have a graceful shutdown period longer than the propagation delay; otherwise, clients can send requests to already-shut-down server workloads and get 503 errors.</p><h2 id="b840" class="ml lp iq bd lq mm mn dn lu mo mp dp ly ko mq mr ma ks ms mt mc kw mu mv me mw bi">Data Gathering: Propagation Delay Metrics</h2><p id="855a" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">Here’s how we discovered the condition: we had been monitoring the Istio metric <em class="mx">pilot_proxy_convergence_time</em> for propagation delay when we noticed an increase from 1.5 seconds (p90 in Istio 1.11) to 4.5 seconds (p90 in Istio 1.12). The <em class="mx">pilot_proxy_convergence_time</em> metric is one of several that Istio records for propagation delay. The complete list of metrics is:</p><ul class=""><li id="e53c" class="my mz iq kh b ki kj kl km ko na ks nb kw nc la nd ne nf ng bi"><em class="mx">pilot_proxy_convergence_time</em> — measures the time from when a push request is added to the push queue to when it is processed and pushed to a workload proxy. 
(Note that change events are converted into push requests and are batched through a process called <em class="mx">debounce</em> before being added to the queue, which we will cover in detail later.)</li><li id="75c6" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi"><em class="mx">pilot_proxy_queue_time</em> — measures the time between when a push request is enqueued and when it is dequeued.</li><li id="a323" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi"><em class="mx">pilot_xds_push_time</em> — measures the time for building and sending the xDS resources. Istio leverages Envoy as its data plane. Istiod, the control plane of Istio, configures Envoy through the xDS API (where x can be viewed as a variable, and DS stands for discovery service).</li><li id="ee42" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi"><em class="mx">pilot_xds_send_time</em> — measures the time for actually sending the xDS resources.</li></ul><p id="c2ae" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">The diagram below shows how each of these metrics maps to the life of a push request.</p><figure class="ld le lf lg gt lh gh gi paragraph-image"><div role="button" tabindex="0" class="li lj di lk bf ll"><div class="gh gi nm"><picture></picture></div></div><figcaption class="nn no gj gh gi np nq bd b be z dk">A high-level diagram to help understand the metrics related to propagation delay.</figcaption></figure><h1 id="921d" class="lo lp iq bd lq lr ls lt lu lv lw lx ly jw lz jx ma jz mb ka mc kc md kd me mf bi">Investigation</h1><h2 id="26bc" class="ml lp iq bd lq mm mn dn lu mo mp dp ly ko mq mr ma ks ms mt mc kw mu mv me mw bi">xDS Lock Contention</h2><p id="98e9" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">CPU profiling showed no noticeable changes between 1.11 and 1.12, but handling push requests took longer, 
indicating that the extra time was spent waiting rather than on computation. This led us to suspect lock contention.</p><p id="de7f" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Istio uses four types of xDS resources to configure Envoy:</p><ul class=""><li id="1451" class="my mz iq kh b ki kj kl km ko na ks nb kw nc la nd ne nf ng bi">Endpoint Discovery Service (EDS) — describes how to discover members of an upstream cluster.</li><li id="e737" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi">Cluster Discovery Service (CDS) — describes how to discover upstream clusters used during routing.</li><li id="9022" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi">Route Discovery Service (RDS) — describes how to discover the route configuration for an HTTP connection manager filter at runtime.</li><li id="cd9b" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi">Listener Discovery Service (LDS) — describes how to discover the listeners at runtime.</li></ul><p id="c80a" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Analysis of the metric <em class="mx">pilot_xds_push_time</em> showed that push times increased for only three resource types (EDS, CDS, and RDS) after the upgrade to 1.12. The Istio changelog revealed that <a class="ae lb" href="https://github.com/istio/istio/pull/33338" rel="noopener ugc nofollow" target="_blank">CDS</a> and <a class="ae lb" href="https://github.com/istio/istio/pull/34243" rel="noopener ugc nofollow" target="_blank">RDS</a> caching was added in 1.12.</p><p id="a1b4" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">To verify that these changes were indeed the culprits, we tried turning off the caches by setting PILOT_ENABLE_CDS_CACHE and PILOT_ENABLE_RDS_CACHE to “False”. 
When we did this, <em class="mx">pilot_xds_push_time</em> for CDS returned to the 1.11 level, but the push times for RDS and EDS did not. This improved the <em class="mx">pilot_proxy_convergence_time</em>, but not enough to return it to the previous level. We believed that there was something else affecting the results.</p><p id="4fa0" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Further investigation into the xDS cache revealed that all xDS computations shared one cache. The tricky thing is that Istio used an LRU cache under the hood. The cache is locked not only on <a class="ae lb" href="https://github.com/istio/istio/blob/1.12.9/pilot/pkg/model/xds_cache.go#L229" rel="noopener ugc nofollow" target="_blank">write</a>s, but also on <a class="ae lb" href="https://github.com/istio/istio/blob/1.12.9/pilot/pkg/model/xds_cache.go#L266" rel="noopener ugc nofollow" target="_blank">read</a>s, because when you read from the cache, you need to promote the item to most recently used. With multiple threads trying to acquire the same lock at the same time, this caused lock contention and slow processing.</p><p id="3c84" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Our hypothesis was that xDS cache lock contention caused slowdowns for CDS and RDS because caching was turned on for those two resources, and also impacted EDS due to the shared cache, but not LDS as it did not have caching implemented.</p><p id="0bb6" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">But why did turning off both the CDS and RDS caches not solve the problem? By looking at where the cache was used when building RDS, we found out that the flag PILOT_ENABLE_RDS_CACHE was not respected. 
We fixed that <a class="ae lb" href="https://github.com/istio/istio/pull/40719" rel="noopener ugc nofollow" target="_blank">bug</a> and conducted performance testing in our test mesh to verify our hypothesis with the following setup:</p><ul class=""><li id="0342" class="my mz iq kh b ki kj kl km ko na ks nb kw nc la nd ne nf ng bi">Control plane: 1 Istiod pod (memory 26 G, CPU 10 cores).</li><li id="378b" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi">Data plane: 50 services and 500 pods. We mimicked changes by restarting deployments randomly every 10 seconds and changing virtual service routings randomly every 5 seconds.</li></ul><p id="06f9" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Here are the results:</p><figure class="ld le lf lg gt lh gh gi paragraph-image"><div role="button" tabindex="0" class="li lj di lk bf ll"><div class="gh gi nr"><picture></picture></div></div><figcaption class="nn no gj gh gi np nq bd b be z dk">A table of results² for the performance testing.</figcaption></figure><p id="d5ff" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Because our Istiod pods were not CPU intensive, we decided to disable the CDS and RDS caches for the moment. As a result, propagation delays returned to the previous level. 
Here is the Istio <a class="ae lb" href="https://github.com/istio/istio/issues/40744" rel="noopener ugc nofollow" target="_blank">issue</a> for this problem and potential future improvement of the xDS cache.</p><h2 id="dca0" class="ml lp iq bd lq mm mn dn lu mo mp dp ly ko mq mr ma ks ms mt mc kw mu mv me mw bi">Debounce</h2><p id="f2ca" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">Here’s a twist in our diagnosis: during a deep dive into the Istio code base, we realized that <em class="mx">pilot_proxy_convergence_time</em> does not actually fully capture propagation delay. We observed in production that 503 errors happened during server deployments even when we set the graceful shutdown time longer than <em class="mx">pilot_proxy_convergence_time</em>. This metric did not accurately reflect what we wanted it to reflect, and we needed to redefine it. Let’s revisit our network diagram, zoomed out to include the debounce process to capture the full life of a change event.</p><figure class="ld le lf lg gt lh gh gi paragraph-image"><div role="button" tabindex="0" class="li lj di lk bf ll"><div class="gh gi ns"><picture></picture></div></div><figcaption class="nn no gj gh gi np nq bd b be z dk"><em class="nt">A high-level diagram of the life of a change event.</em></figcaption></figure><p id="ef5d" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">The process starts when an Istiod controller³ is notified of a change. This triggers a push, which is sent to the push channel. Istiod then groups these changes together into one combined push request through a process called debouncing. Next, Istiod calculates the push context, which contains all the necessary information for generating xDS. The push request, together with the context, is then added to the push queue. 
Here’s the problem: <em class="mx">pilot_proxy_convergence_time</em> only measures the time from when the combined push is added to the push queue, to when a proxy receives the calculated xDS.</p><p id="d8bf" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">From Istiod logs we found out that the debounce time was almost 110 seconds, even though we set PILOT_DEBOUNCE_MAX to 30 seconds. From reading the code, we realized that the <a class="ae lb" href="https://github.com/istio/istio/blob/1.15.3/pilot/pkg/xds/discovery.go#L532" rel="noopener ugc nofollow" target="_blank">initPushContext</a> step was blocking the next debounce to ensure that older changes are processed first.</p><p id="bbe3" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">To debug and test changes, we needed a testing environment. However, it was difficult to generate the same load on our test environment. Fortunately, the debounce and init push context are not affected by the number of Istio proxies. We set up a development box in production with no connected proxies and ran custom images to triage and test out fixes.</p><p id="0ed7" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">We performed CPU profiling and took a closer look into functions that were taking a long time:</p><figure class="ld le lf lg gt lh gh gi paragraph-image"><div role="button" tabindex="0" class="li lj di lk bf ll"><div class="gh gi nu"><picture></picture></div></div><figcaption class="nn no gj gh gi np nq bd b be z dk"><em class="nt">A CPU profile of Istiod.</em></figcaption></figure><p id="44ff" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">A significant amount of time was spent on the Service DeepCopy function. 
This was due to the use of the <a class="ae lb" href="https://github.com/mitchellh/copystructure" rel="noopener ugc nofollow" target="_blank">copystructure</a> library, which uses <a class="ae lb" href="https://go.dev/blog/laws-of-reflection" rel="noopener ugc nofollow" target="_blank">Go reflection</a> to perform deep copies, an expensive operation. Removing the library⁴ was both easy and very effective, reducing our debounce time from 110 seconds to 50 seconds.</p><figure class="ld le lf lg gt lh gh gi paragraph-image"><div role="button" tabindex="0" class="li lj di lk bf ll"><div class="gh gi nv"><picture></picture></div></div><figcaption class="nn no gj gh gi np nq bd b be z dk"><em class="nt">A CPU profile of Istiod after DeepCopy improvement.</em></figcaption></figure><p id="5505" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">After the DeepCopy improvement, the next big chunk in the CPU profile was the ConvertToSidecarScope function. This function took a long time to determine which virtual services were imported by each Istio proxy. For each proxy egress host, Istiod first computed all the virtual services exported to the proxy’s namespace, then selected the virtual services by matching the proxy egress host name to the virtual services’ hosts.</p><p id="2fec" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">All our virtual services were public as we did not specify the <em class="mx">exportTo</em> parameter, which is a list of namespaces to which this virtual service is exported. If this parameter is not configured, the virtual service is automatically exported to all namespaces. 
Therefore, the <a class="ae lb" href="https://github.com/istio/istio/blob/1.12.9/pilot/pkg/model/push_context.go#L829-L833" rel="noopener ugc nofollow" target="_blank">VirtualServicesForGateway</a> function created and copied all virtual services each time. This deep copy of slice elements was very expensive when we had many proxies with multiple egress hosts.</p><p id="12cd" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">We <a class="ae lb" href="https://github.com/istio/istio/pull/41101" rel="noopener ugc nofollow" target="_blank">reduced</a> the unnecessary copying of virtual services: instead of passing a copied version of the virtual services, we passed the virtualServiceIndex directly into the select function, further reducing the debounce time from 50 seconds to around 30 seconds.</p><p id="44ce" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">Another improvement that we are currently rolling out is to limit where virtual services are exported by setting the exportTo field, based on which clients are allowed to access the services. This should reduce debounce time by about 10 seconds.</p><p id="01e8" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">The Istio community is also actively working on improving the push context calculation. Some ideas include <a class="ae lb" href="https://github.com/istio/istio/issues/41453" rel="noopener ugc nofollow" target="_blank">adding multiple workers to compute the sidecar scope</a> and <a class="ae lb" href="https://github.com/istio/istio/pull/41647" rel="noopener ugc nofollow" target="_blank">processing changed sidecars only instead of rebuilding the entire sidecar scope</a>. 
We also added <a class="ae lb" href="https://github.com/istio/istio/pull/40523" rel="noopener ugc nofollow" target="_blank">metrics for the debounce time</a> so that we can monitor this together with the proxy convergence time to track propagation delay accurately.</p><h1 id="1487" class="lo lp iq bd lq lr ls lt lu lv lw lx ly jw lz jx ma jz mb ka mc kc md kd me mf bi">Conclusion</h1><p id="ba4e" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">To conclude our diagnosis, we learned that:</p><ul class=""><li id="d337" class="my mz iq kh b ki kj kl km ko na ks nb kw nc la nd ne nf ng bi">We should use both <em class="mx">pilot_debounce_time</em> and <em class="mx">pilot_proxy_convergence_time</em> to track propagation delay.</li><li id="2ef4" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi">The xDS cache can help with CPU usage but can impact propagation delay due to lock contention. Tune PILOT_ENABLE_CDS_CACHE &amp; PILOT_ENABLE_RDS_CACHE to see what’s best for your system.</li><li id="63e7" class="my mz iq kh b ki nh kl ni ko nj ks nk kw nl la nd ne nf ng bi">Restrict the visibility of your Istio manifests by setting the <em class="mx">exportTo</em> field.</li></ul><p id="119e" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">If this type of work interests you, check out some of our related <a class="ae lb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">roles</a>!</p><h1 id="8117" class="lo lp iq bd lq lr ls lt lu lv lw lx ly jw lz jx ma jz mb ka mc kc md kd me mf bi">Acknowledgments</h1><p id="cfd6" class="pw-post-body-paragraph kf kg iq kh b ki mg jr kk kl mh ju kn ko mi kq kr ks mj ku kv kw mk ky kz la ij bi">Thanks to the Istio community for creating a great open source project and for collaborating with us to make it even better. 
Also, a shout-out to the whole AirMesh team for building, maintaining and improving the service mesh layer at Airbnb. Thanks to Lauren Mackevich, Mark Giangreco and Surashree Kulkarni for editing the post.</p></div><div class="ab cl nw nx hu ny" role="separator"><div class="ij ik il im in"><p id="3053" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">[1]: Check out our presentation <a class="ae lb" href="https://events.istio.io/istiocon-2021/sessions/airbnb-on-istio/" rel="noopener ugc nofollow" target="_blank">Airbnb on Istio</a> for details.</p><p id="b9d2" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">[2]: Note that some CPU throttling occurred for the last two cases, so if we were to allocate more CPU we would expect propagation delay (especially P99) to improve further.</p><p id="2cdc" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">[3]: The Istiod service controller monitors changes to registered services from different sources, including Kubernetes services, ServiceEntry-created services, etc. The Istiod config controller monitors changes to the Istio resources used to manage those services.</p><p id="7901" class="pw-post-body-paragraph kf kg iq kh b ki kj jr kk kl km ju kn ko kp kq kr ks kt ku kv kw kx ky kz la ij bi">[4]: <a class="ae lb" href="https://github.com/istio/istio/pull/40966/files" rel="noopener ugc nofollow" target="_blank">PR1</a>, <a class="ae lb" href="https://github.com/istio/istio/pull/37932/files" rel="noopener ugc nofollow" target="_blank">PR2</a></p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/improving-istio-propagation-delay-d4da9b5b9f90</link>
      <guid>https://medium.com/airbnb-engineering/improving-istio-propagation-delay-d4da9b5b9f90</guid>
      <pubDate>Thu, 23 Mar 2023 19:20:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Building Airbnb Categories with ML & Human in the Loop]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gk gl gm gn go gp gq gr gs gt l"><div class="ab gu gv"><div class="ab"><div class="fm l"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@mihajlo.grbovic?source=post_page-----35b78a837725--------------------------------"><div class="l di"><img alt="Mihajlo Grbovic" class="l de bw gw gx fe" src="https://miro.medium.com/v2/resize:fill:96:96/1*hdBvFFL4w9pznHLwQUJjdw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bd b gy gz bi"><div class="ha ab q"><div><div class="bk" aria-hidden="false"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@mihajlo.grbovic?source=post_page-----35b78a837725--------------------------------"><div class="ab q">Mihajlo Grbovic</div></a></div></div><div class="hb hc hd he i d"></div></div><div class="ab q hg"><p class="pw-published-date bd b be z dk">Mar 22</p><div class="hh bk" aria-hidden="true">·</div><div class="pw-reading-time bd b be z dk">11 min read</div></div></div></div><div class="ab q"><div class="h k hi hj hk"><div class="hl l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hl l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hl l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hp ab q"></div></div></div><div class="hu s u j i d"><div class="fm l"><div class="hz l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hz l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="hz l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="l fo"><div><div class="bk" aria-hidden="false"></div></div><div class="bl l"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ij ik il im in"><div 
class=""><p id="539d" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Airbnb Categories Blog Series — Part II : ML Categorization</p><p id="9adb" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">By: <strong class="jp ir">Mihajlo Grbovic, Pei Xiong, Pratiksha Kadam, Ying Xiao, Aaron Yin, Weiping Peng, Shukun Yang, Chen Qian, Haowei Zhang, Sebastien Dubois, Nate Ney, James Furnary, Mark Giangreco, Nate Rosenthal, Cole Baker, Bill Ulammandakh, Sid Reddy, Egor Pakhomov</strong></p><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi kl"><picture></picture></div></div></figure><p id="d603" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi"><a class="ae kx" href="https://news.airbnb.com/2022-summer-release/" rel="noopener ugc nofollow" target="_blank">Airbnb 2022 release</a> introduced Categories, a browse focused product that allows the user to seek inspiration by browsing collections of homes revolving around a common theme, such as <em class="ky">Lakefront, Countryside, Golf, Desert, National Parks</em>, <em class="ky">Surfing</em>, etc. In <a class="ae kx" rel="noopener" href="https://medium.com/airbnb-engineering/building-airbnb-categories-with-ml-and-human-in-the-loop-e97988e70ebb">Part I</a> of our Categories Blog Series we covered the high level approach to creating Categories and showcasing them in the product. In this Part II we will describe the ML Categorization work in more detail.</p><p id="3e98" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Throughout the post we use the <strong class="jp ir"><em class="ky">Lakefront</em> category</strong> as a running example to showcase the ML-powered category development process. 
A similar process was applied to other categories, with category-specific nuances. For example, some categories rely more on points of interest, while others rely more on structured listing signals, image data, etc.</p><h2 id="835d" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi"><strong class="ak">Category Definition</strong></h2><p id="ed4d" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">Category development starts with a product-driven category definition: “<em class="ky">Lakefront category should include listings that are less than 100 meters from the lake</em>”. While this may sound like an easy task at first, it is very delicate and complex as it involves leveraging multiple structured and unstructured listing attributes, points of interest (POIs), etc. It also involves training ML models that combine them, since none of the signals captures the entire space of possible candidates on its own.</p><h2 id="1e79" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi">Listing Understanding Signals</h2><p id="0da8" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">As part of various past projects, multiple teams at Airbnb spent time processing different types of raw data to extract useful information in structured form. Our goal was to leverage these signals for cold-start rule-based category candidate generation and later use them as features of the ML model that could find category candidates with higher precision:</p><ul class=""><li id="108e" class="lx ly iq jp b jq jr ju jv jy lz kc ma kg mb kk mc md me mf bi"><strong class="jp ir">Host-provided listing information</strong>, such as <strong class="jp ir"><em class="ky">property type</em></strong> (e.g. 
castle, houseboat), <strong class="jp ir"><em class="ky">amenities &amp; attributes</em></strong> (pool, fire pit, forest view, etc.), <strong class="jp ir"><em class="ky">listing</em></strong> <strong class="jp ir"><em class="ky">location</em></strong>, and <strong class="jp ir"><em class="ky">title, description, image captions</em></strong> that can be scanned for keywords (we gathered exhaustive sets of keywords in different languages per category).</li><li id="e304" class="lx ly iq jp b jq mg ju mh jy mi kc mj kg mk kk mc md me mf bi"><a class="ae kx" href="https://www.airbnb.com/resources/hosting-homes/a/create-a-guidebook-to-share-your-local-tips-23" rel="noopener ugc nofollow" target="_blank"><strong class="jp ir">Host guidebooks</strong></a>, where hosts recommend nearby places for guests to visit (e.g. a Vineyard, Surf beach, Golf course), which hold location data that was useful for extracting <strong class="jp ir"><em class="ky">POIs</em></strong>.</li><li id="90ac" class="lx ly iq jp b jq mg ju mh jy mi kc mj kg mk kk mc md me mf bi"><a class="ae kx" href="https://www.airbnb.com/s/experiences" rel="noopener ugc nofollow" target="_blank"><strong class="jp ir">Airbnb experiences</strong></a>, such as <em class="ky">Surfing</em>, <em class="ky">Golfing, Scuba</em>, etc. <strong class="jp ir"><em class="ky">Locations of these activities</em></strong> proved useful in identifying listing candidates for certain activity-related categories.</li><li id="b051" class="lx ly iq jp b jq mg ju mh jy mi kc mj kg mk kk mc md me mf bi"><strong class="jp ir">Guest reviews</strong>, which are another source that can be scanned for <strong class="jp ir"><em class="ky">keywords</em></strong>. 
We also collect supplemental guest reviews where guests provide <strong class="jp ir"><em class="ky">feedback on listing quality, amenities and attributes.</em></strong></li><li id="5e3c" class="lx ly iq jp b jq mg ju mh jy mi kc mj kg mk kk mc md me mf bi"><a class="ae kx" href="https://www.airbnb.com/wishlists/popular" rel="noopener ugc nofollow" target="_blank"><strong class="jp ir">Wishlists</strong></a> that guests create when browsing, such as “Golf trip 2022”, “Beachfront”, “Yosemite trip”, are often related to one of the categories, which proved useful for candidate generation.</li></ul><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi ml"><picture></picture></div></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk">Figure 1. Popular wishlists created by Airbnb users</figcaption></figure><p id="491b" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">The listing understanding knowledge base was further enriched using external data, such as <strong class="jp ir">Satellite data</strong> (tells us if a listing is close to an ocean, river or lake), <strong class="jp ir">Climate, Geospatial data</strong>, <strong class="jp ir">Population data</strong> (tells us if a listing is in a rural, urban or metropolitan area) and <strong class="jp ir">POI data</strong> that contains names and locations of places of interest from host guidebooks or collected by us via open source datasets and further improved, enriched and adjusted by in-house human review.</p><p id="3744" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Finally, we leveraged our in-house ML models for additional knowledge extraction from raw listing data. 
These included <strong class="jp ir">ML models for</strong> <a class="ae kx" rel="noopener" href="https://medium.com/airbnb-engineering/amenity-detection-and-beyond-new-frontiers-of-computer-vision-at-airbnb-144a4441b72e"><strong class="jp ir">Detecting amenities and objects in listing images</strong></a>, <a class="ae kx" rel="noopener" href="https://medium.com/airbnb-engineering/categorizing-listing-photos-at-airbnb-f9483f3ab7e3"><strong class="jp ir">Categorizing room types and outdoor spaces in listing images</strong></a>, <a class="ae kx" rel="noopener" href="https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e"><strong class="jp ir">Computing embedding similarities between listings</strong></a> and <a class="ae kx" rel="noopener" href="https://medium.com/airbnb-engineering/when-a-picture-is-worth-more-than-words-17718860dcc2"><strong class="jp ir">Assessing property aesthetics</strong></a>. Each of these was useful in different stages of category development: candidate generation, expansion and quality prediction, respectively.</p><h2 id="c378" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi"><strong class="ak">Rule-based candidate generation</strong></h2><p id="eeab" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">Once a category is defined, we first leverage pre-computed listing understanding signals and ML model outputs described in the previous section to codify the definition with a set of rules. 
Our candidate generation engine then applies them to produce a set of rule-based candidates and prioritizes them for human review based on a category confidence score.</p><p id="c2e5" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">This confidence score is computed based on how many signals qualified the listing for the category and the weights associated with each rule. For example, for the <em class="ky">Lakefront</em> category, vicinity to a Lake POI carried the most weight, host-provided signals on direct lake access were the next most important, lakefront keywords found in listing title, description, wishlists and reviews carried less weight, while lake and water detection in listing images carried the least weight. A listing with all of these attributes would have a very high confidence score, while a listing with only one would have a lower score.</p><h2 id="9a93" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi"><strong class="ak">Human review process</strong></h2><p id="4a7b" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">Candidates were sent for human review daily, by selecting a certain number of listings from each category with the highest category confidence score. 
Human agents then judged whether the listing belonged to the category, chose the best cover photo and assessed the quality of the listing (Figure 3).</p><p id="d358" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">As human reviews started rolling in and there were enough listings with confirmed and rejected category tags, it unlocked new candidate generation techniques that started contributing their own candidates:</p><ul class=""><li id="3b84" class="lx ly iq jp b jq jr ju jv jy lz kc ma kg mb kk mc md me mf bi"><strong class="jp ir">Proximity based: </strong>leveraging distance to a confirmed listing in a given category, e.g. a neighbor of a confirmed <em class="ky">Lakefront</em> listing may also be <em class="ky">Lakefront</em>.</li><li id="3e3d" class="lx ly iq jp b jq mg ju mh jy mi kc mj kg mk kk mc md me mf bi"><strong class="jp ir">Embedding similarity</strong>: leveraging listing embeddings to find listings that are most similar to confirmed listings in a given category.</li><li id="fd29" class="lx ly iq jp b jq mg ju mh jy mi kc mj kg mk kk mc md me mf bi"><strong class="jp ir">Training ML categorization</strong> <strong class="jp ir">models</strong>: once the agents reviewed 20% of rule-based candidates we started training ML models.</li></ul><p id="7729" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">In the beginning, only agent-vetted listings were sent to production and featured on the homepage. Over time, as our candidate generation techniques produced more candidates and the feedback loop repeated, it allowed us to train better and better ML models with more labeled data. 
Finally, at some point, when ML models were good enough, we started sending listings with high enough model scores to production (Figure 2).</p><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi mq"><picture></picture></div></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk">Figure 2. Number of listings in production per category and fractions vetted by humans</figcaption></figure><h1 id="4571" class="mr la iq bd lb ms mt mu le mv mw mx lh my mz na lk nb nc nd ln ne nf ng lq nh bi">Aligning ML Models with Human review tasks</h1><p id="2a86" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">In order to scale the review process we trained ML models that mimic each of the three human agent tasks (Figure 3). In the following sections we will demonstrate the training and evaluation process involved with each model</p><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi ni"><picture></picture></div></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk">Figure 3. ML models setup for mimicking human review</figcaption></figure><h2 id="2bd0" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi"><strong class="ak">ML Categorization Model</strong></h2><p id="8c30" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">ML Categorization Model task was to confidently place listings in a category. These models were trained using Bighead (Airbnb’s ML platform) as XGBoost binary<em class="ky"> per category </em>classification models. They used agent category assignments as labels and signals described in the Listing Understanding section as features. 
As opposed to a rule-based setting, ML models allowed us to have better control of the precision of candidates via model score threshold.</p><p id="9973" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Although many features are shared across categories and one could train a single multiclass model, due to the high imbalance in category sizes and dominance of category-specific features we found it better to train dedicated ML per category models. Another big reason for this was that a major change to a single category, such as change in definition, large addition of new POIs or labels, did not require us to retrain, launch and measure impact on all the categories, but instead conveniently work on a single category in isolation.</p><p id="5c9a" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi"><strong class="jp ir">Lakefront ML model</strong></p><p id="20f4" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi"><strong class="jp ir">Features</strong>: the first step was to build features, with the most important one being distance to Lake POI. We started with collecting Lake POIs represented as a single point and later added lake boundaries that trace the lake, which greatly improved the accuracy of being able to pull listings near the boundary. 
However, as shown in Figure 4, even then there were many edge cases that led to mistakes in rule-based listing assignment.</p><div class="km kn ko kp gt ab cb"><figure class="nj kq nk nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="nj kq np nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="nj kq nq nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk nr di ns nt">Figure 4. Examples of imperfect POI (left) and complex geography: highway between lake and home (middle), long backyards (right)</figcaption></figure></div><p id="d3da" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">These include imperfect lake boundaries that can be inside the water or outside on land, highways between the lake and houses, houses on cliffs, imperfect listing locations, missing POIs, and POIs that are not actual lakes, such as reservoirs and ponds. For this reason, it proved beneficial to combine POI data with other listing signals as ML model features and then use the model to proactively improve the Lake POI database.</p><p id="eabe" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">One modeling maneuver that proved to be useful here was <strong class="jp ir">feature dropout</strong>. 
Since most of the features were also used for generating rule-based candidates that were graded by agents, resulting in labels that are used by the ML model, there was a risk of overfitting and limited pattern discovery beyond the rules.</p><p id="b9b1" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">To address this problem, during training we would randomly drop some feature signals, such as distance from Lake POI, from some listings. As a result, the model did not over rely on the dominant POI feature, which allowed listings to have a high ML score even if they are not close to any known Lake POI. This allowed us to find missing POIs and add them to our database.</p><p id="4efd" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi"><strong class="jp ir">Labels</strong>: <strong class="jp ir">Positive labels </strong>were assigned to listings agents tagged as <em class="ky">Lakefront</em>, <strong class="jp ir">Negative labels </strong>were assigned to listings sent for review as <em class="ky">Lakefront</em> candidates but rejected (<strong class="jp ir">Hard negatives </strong>from modeling perspective). 
We also sampled negatives from the related <em class="ky">Lake House</em> category, which allows a greater distance to the lake (<strong class="jp ir">Easier negatives</strong>), and from listings tagged in other categories (<strong class="jp ir">Easiest negatives</strong>).</p><p id="a1f3" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi"><strong class="jp ir">Train / Test split:</strong> 70:30 random split, where we had special handling of distance and embedding similarity features so as not to leak the label.</p><div class="km kn ko kp gt ab cb"><figure class="nj kq nu nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="nj kq nv nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk nw di nx nt">Figure 5. Lakefront ML model feature importance and performance evaluation</figcaption></figure></div><p id="6784" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">We trained several models using different feature subsets. We were interested in how well POI data can do on its own and what improvements additional signals can provide. As can be observed in Figure 5, the POI distance is the most important feature by far. However, when used on its own it cannot approach the ML model performance. Specifically, the ML model improves Average Precision by 23%, from 0.74 to 0.91, which confirmed our hypothesis.</p><p id="c2f0" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Since the POI feature is the most important feature, we invested in improving it by adding new POIs and refining existing POIs. 
This proved to be beneficial as the ML model using <em class="ky">improved</em> POI features greatly outperforms the model that used <em class="ky">initial</em> POI features (Figure 5).</p><p id="a56a" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">The process of Lake POI refinement included leveraging trained ML model to<strong class="jp ir"> find missing or imperfect POIs</strong> by inspecting listings that have a high model score but are far from existing Lake POIs (Figure 6 left) and <strong class="jp ir">removing wrong POIs</strong> by inspecting listings that have a low model score but are very close to an existing Lake POI (Figure 6 right)</p><div class="km kn ko kp gt ab cb"><figure class="nj kq ny nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="nj kq nz nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk oa di ob nt">Figure 6. Process of finding missing POIs (Left) and wrong POIs (Right)</figcaption></figure></div><p id="b36f" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi"><strong class="jp ir">Sending confident listings to production: </strong>using the test set Precision-Recall curve we found a threshold that achieves 90% Precision. 
We used this threshold to make a decision on which candidates can go directly to production and which need to be sent for human review first.</p><h2 id="17f8" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi"><strong class="ak">Cover Image ML model</strong></h2><p id="6ffd" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">To carry out the second agent task with ML, we needed to train a different type of ML model. One whose task would be to choose the most appropriate listing cover photo given the category context. For example, choosing a listing photo with a lake view for the Lakefront category.</p><p id="f359" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">We tested several out of the box object detection models as well as several in-house solutions trained using human review data, i.e. (listing id, category, cover photo id) tuples. We found that the best cover photo selection accuracy was achieved by fine-tuning a <a class="ae kx" href="https://huggingface.co/google/vit-base-patch16-224-in21k" rel="noopener ugc nofollow" target="_blank">Vision Transformer model</a> (VT) using our human review data. Once trained, the model can score all listing photos and decide which one is the best cover photo for a given category.</p><p id="8fef" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">To evaluate the model we used a hold out dataset and tested if the agent selected listing photo for a particular category was within the top 3 highest scoring VT model photos for the same category. 
The average Top 3 precision on all categories was 70%, which we found satisfactory.</p><p id="5d0a" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">To further test the model we judged if the VT selected photo represented the category better than the Host selected cover photo (Figure 7). It was found that the VT model can select a better photo in 77% of the cases. It should be noted that the Host selected cover photo is typically chosen without taking any category into account, as the one that best represents the listing in the search feed.</p><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi ml"><picture></picture></div></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk">Figure 7. Vision Transformer vs. Host selected cover photo selection for the same listing for Lakefront category</figcaption></figure><p id="d3d1" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">In addition to selecting the best cover photo for candidates that are sent to production by the ML categorization model, the VT model was also used to speed up the human review process. 
By ordering the candidate listing photos in descending order of the VT score, we were able to reduce the time it takes the agents to make a decision on a category and cover photo by 18%.</p><p id="550e" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Finally, for some highly visual categories, such as <em class="ky">Design</em> and <em class="ky">Creative spaces</em>, the VT model proved to be useful for direct candidate generation.</p><h2 id="f0bd" class="kz la iq bd lb lc ld dn le lf lg dp lh jy li lj lk kc ll lm ln kg lo lp lq lr bi">Quality ML Model</h2><p id="e146" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">The final human review task is to judge the quality of the listing by selecting one of the four tiers: Most Inspiring, High Quality, Acceptable, Low Quality. As we will discuss in Part III of the blog series, the quality plays a role in the ranking of listings in the search feed.</p><p id="8f17" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">To train an ML model that can predict the quality of a listing, we used a combination of engagement, quality and visual signals to create a feature set and agent quality tags to create labels. The features included review ratings, wishlists, image quality, embedding signals and listing amenities and attributes, such as price, number of guests, etc.</p><p id="becc" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Given the multi-class setup with four quality tiers, we experimented with different loss functions (pairwise loss, one-vs-all, one-vs-one, multi label, etc.). 
We then compared the ROC curves of different strategies on a hold-out set and the binary one-vs-all models performed the best.</p><div class="km kn ko kp gt ab cb"><figure class="nj kq oc nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="nj kq od nl nm nn no paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk oe di of nt">Figure 8: Quality ML model feature importance and ROC curve</figcaption></figure></div><p id="7483" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">In addition to playing a role in search ranking, the Quality ML score also played a role in the human review prioritization logic. With all three ML models functional for all three human review tasks, we could now streamline the review process and send more candidates directly to production, while also prioritizing some for human review. This prioritization plays an important role in the system because listings that are vetted by humans may rank higher in the category feed.</p><p id="d8c1" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">There were several factors to consider when prioritizing listings for human review, including listing category confidence score, listing quality, bookability and popularity of the region. The best strategy proved to be a combination of those factors. 
In Figure 9 we show the top candidates for human review for several categories at the time of writing this post.</p><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi ml"><picture></picture></div></div></figure><figure class="km kn ko kp gt kq gh gi paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gh gi ml"><picture></picture></div></div><figcaption class="mm mn gj gh gi mo mp bd b be z dk">Figure 9: Listings prioritized for review in 4 different categories</figcaption></figure><p id="8367" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">Once graded, those labels are then used for periodic model re-training in an active feedback loop that continuously improves the category accuracy and coverage.</p><h1 id="9c8e" class="mr la iq bd lb ms mt mu le mv mw mx lh my mz na lk nb nc nd ln ne nf ng lq nh bi">Future work</h1><p id="c150" class="pw-post-body-paragraph jn jo iq jp b jq ls js jt ju lt jw jx jy lu ka kb kc lv ke kf kg lw ki kj kk ij bi">Our future work involves iterating on the three ML models in several directions, including generating a larger set of labels using generative vision models and potentially combining them into a single multi-task model. We are also exploring ways of using Large Language Models (LLMs) for conducting category review tasks.</p><p id="8b54" class="pw-post-body-paragraph jn jo iq jp b jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk ij bi">If this type of work interests you, check out some of our related <a class="ae kx" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">roles</a>!</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/building-airbnb-categories-with-ml-human-in-the-loop-35b78a837725</link>
      <guid>https://medium.com/airbnb-engineering/building-airbnb-categories-with-ml-human-in-the-loop-35b78a837725</guid>
      <pubDate>Wed, 22 Mar 2023 23:03:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Prioritizing Home Attributes Based on Guest Interest]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gj gk gl gm gn go gp gq gr gs l"><div class="ab gt gu"><div class="ab"><div class="fl l"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@joy.jing1?source=post_page-----3c49b827e51a--------------------------------"><div class="l di"><img alt="Joy Jing" class="l de bw gv gw fd" src="https://miro.medium.com/fit/c/96/96/1*fOnHPQ1BRg7ZFNg7bm5QzQ.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bd b gx gy bi"><div class="gz ab q"><div><div class="bk" aria-hidden="false"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@joy.jing1?source=post_page-----3c49b827e51a--------------------------------"><div class="ab q">Joy Jing</div></a></div></div><div class="ha hb hc hd i d"></div></div><div class="ab q hf"><p class="pw-published-date bd b be z dk">Feb 16</p><div class="hg bk" aria-hidden="true">·</div><div class="pw-reading-time bd b be z dk">7 min read</div></div></div></div><div class="ab q"><div class="h k hh hi hj"><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="ho ab q"></div></div></div><div class="ht s u j i d"><div class="fl l"><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="bl l"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ii ij ik il im"><div class=""><p id="bad4" 
class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">How Airbnb leverages ML to derive guest interest from unstructured text data and provide personalized recommendations to Hosts</strong></p><figure class="kl km kn ko gs kp gg gh paragraph-image"><div role="button" tabindex="0" class="kq kr di ks bf kt"><div class="gg gh kk"><picture></picture></div></div></figure><p id="4e01" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">By: </strong><a class="ae kw" href="https://www.linkedin.com/in/joyjing1/" rel="noopener ugc nofollow" target="_blank"><strong class="jo iq">Joy Jing</strong></a><strong class="jo iq"> and </strong><a class="ae kw" href="https://www.linkedin.com/in/jing-julia-xia-029b3a123/" rel="noopener ugc nofollow" target="_blank"><strong class="jo iq">Jing Xia</strong></a></p><p id="48ef" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">At Airbnb, we endeavor to build a world where anyone can belong anywhere. We strive to understand what our guests care about and match them with Hosts who can provide what they are looking for. What better source for guest preferences than the guests themselves?</p><p id="5269" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We built a system called the <strong class="jo iq">Attribute Prioritization System</strong> (APS) to listen to our guests’ needs in a home: What are they requesting in messages to Hosts? What are they commenting on in reviews? What are common requests when calling customer support? 
And how does it differ by the home’s location, property type, price, as well as guests’ travel needs?</p><p id="a06e" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">With this personalized understanding of what home amenities, facilities, and location features (i.e. “home attributes”) matter most to our guests, we advise Hosts on which home attributes to acquire, merchandize, and verify. We can also display to guests the home attributes that are most relevant to their destination and needs.</p><p id="e3b4" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We do this through a scalable, platformized, and data-driven engineering system. This blog post describes the science and engineering behind the system.</p><p id="48c7" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">What do guests care about?</strong></p><p id="ce78" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">First, to determine what matters most to our guests in a home, we look at what guests request, comment on, and contact customer support about the most. Are they asking a Host whether they have wifi, free parking, a private hot tub, or access to the beach?</p><p id="abc5" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">To parse this unstructured data at scale, Airbnb built <strong class="jo iq">LATEX</strong> (<strong class="jo iq">L</strong>isting <strong class="jo iq">AT</strong>tribute <strong class="jo iq">EX</strong>traction), a machine learning system that can extract home attributes from unstructured text data like guest messages and reviews, customer support tickets, and listing descriptions. 
LATEX accomplishes this in two steps:</p><ol class=""><li id="62e4" class="kx ky ip jo b jp jq jt ju jx kz kb la kf lb kj lc ld le lf bi">A <strong class="jo iq">named entity recognition (NER) module</strong> extracts key phrases from unstructured text data</li><li id="63f8" class="kx ky ip jo b jp lg jt lh jx li kb lj kf lk kj lc ld le lf bi">An <strong class="jo iq">entity mapping module</strong> then maps these key phrases to home attributes</li></ol><p id="2fd5" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">The <a class="ae kw" href="https://spacy.io/usage/linguistic-features#named-entities" rel="noopener ugc nofollow" target="_blank">named entity recognition (NER)</a> module uses <a class="ae kw" href="https://arxiv.org/pdf/1408.5882.pdf" rel="noopener ugc nofollow" target="_blank">textCNN (convolutional neural network for text)</a> and is trained and fine-tuned on human-labeled text data from various data sources within Airbnb. In the training dataset, we label each phrase that falls into one of the following five categories: Amenity, Activity, Event, Specific POI (e.g. “Lake Tahoe”), or generic POI (e.g. “post office”).</p><p id="d45d" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">The entity mapping module uses an unsupervised learning approach to map these phrases to home attributes. To achieve this, we compute the cosine distance between the candidate phrase and the attribute label in the fine-tuned word embedding space. We consider the closest mapping to be the referenced attribute, and can calculate a confidence score for the mapping.</p><p id="f5cd" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We then calculate how frequently an entity is referenced in each text source (i.e. 
messages, reviews, customer service tickets), and aggregate the normalized frequency across text sources. Home attributes with many mentions are considered more important.</p><p id="1c12" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">With this system, we are able to gain insight into what guests are interested in, even highlighting new entities that we may not yet support. The scalable engineering system also allows us to improve the model by onboarding additional data sources and languages.</p><figure class="kl km kn ko gs kp gg gh paragraph-image"><div role="button" tabindex="0" class="kq kr di ks bf kt"><div class="gg gh ll"><picture></picture></div></div><figcaption class="lm ln gi gg gh lo lp bd b be z dk">An example of a listing’s description with keywords highlighted and labeled by the Latex NER model.</figcaption></figure><p id="47e5" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">What do guests care about for different types of homes?</strong></p><p id="bf1e" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">What guests look for in a mountain cabin is different from an urban apartment. Gaining a more complete understanding of guests’ needs in an Airbnb home enables us to provide more personalized guidance to Hosts.</p><p id="035c" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">To achieve this, we calculate a unique ranking of attributes for each home. Based on the characteristics of a home–location, property type, capacity, luxury level, etc–we predict how frequently each attribute will be mentioned in messages, reviews, and customer service tickets. 
We then use these predicted frequencies to calculate a customized importance score that is used to rank all possible attributes of a home.</p><p id="ba5b" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">For example, let us consider a mountain cabin that can host six people with an average daily price of $50. In determining what is most important for potential guests, we learn from what is most talked about for other homes that share these same characteristics. The result: hot tub, fire pit, lake view, mountain view, grill, and kayak. In contrast, what’s important for an urban apartment are: parking, restaurants, grocery stores, and subway stations.</p><figure class="kl km kn ko gs kp gg gh paragraph-image"><div role="button" tabindex="0" class="kq kr di ks bf kt"><div class="gg gh lq"><picture></picture></div></div><figcaption class="lm ln gi gg gh lo lp bd b be z dk"><strong class="bd lr">Image:</strong> An example image of a mountain cabin home</figcaption></figure><figure class="kl km kn ko gs kp gg gh paragraph-image"><div class="gg gh ls"><picture></picture></div><figcaption class="lm ln gi gg gh lo lp bd b be z dk">An example of home attributes ranked for a mountain cabin vs an urban apartment.</figcaption></figure><figure class="kl km kn ko gs kp gg gh paragraph-image"><div role="button" tabindex="0" class="kq kr di ks bf kt"><div class="gg gh lq"><picture></picture></div></div><figcaption class="lm ln gi gg gh lo lp bd b be z dk"><strong class="bd lr">Image:</strong> An example of an urban apartment home</figcaption></figure><p id="9862" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We could directly aggregate the frequency of keyword usage amongst similar homes. But this approach would run into issues at scale; the cardinality of our home segments could grow exponentially large, with sparse data in very unique segments. 
Instead, we built an inference model that uses the raw keyword frequency data to infer the expected frequency for a segment. This inference approach scales as we use more, and finer, dimensions to characterize our homes. This allows us to help our Hosts best highlight their unique and diverse collection of homes.</p><p id="550d" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">How can guests’ preferences help Hosts improve?</strong></p><p id="3883" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Now that we have a granular understanding of what guests want, we can help Hosts showcase what guests are looking for by:</p><ul class=""><li id="b8c0" class="kx ky ip jo b jp jq jt ju jx kz kb la kf lb kj lt ld le lf bi">Recommending that Hosts acquire an amenity guests often request (e.g. coffee maker)</li><li id="190b" class="kx ky ip jo b jp lg jt lh jx li kb lj kf lk kj lt ld le lf bi">Merchandizing an existing home attribute that guests tend to comment favorably on in reviews (e.g. patio)</li><li id="956d" class="kx ky ip jo b jp lg jt lh jx li kb lj kf lk kj lt ld le lf bi">Clarifying popular facilities that may end up in requests to customer support (e.g. the privacy and ability to access a pool)</li></ul><p id="ca29" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">But to make these recommendations relevant, it’s not enough to know what guests want. We also need to be sure about what’s already in the home. This turns out to be trickier than asking the Host due to the 800+ home attributes we collect. Most Hosts aren’t able to immediately and accurately add all of the attributes their home has, especially since amenities like a crib mean different things to different people. 
To fill in some of the gaps, we leverage guest feedback on amenities and facilities they have seen or used. In addition, some home attributes are available from trustworthy third parties, such as real estate or geolocation databases that can provide square footage, bedroom count, or whether the home overlooks a lake or beach. We’re able to build a truly complete picture of a home by leveraging data from our Hosts, guests, and trustworthy third parties.</p><p id="9001" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We utilize several different models, including a Bayesian inference model that increases in confidence as more guests confirm that the home has an attribute. We also leverage a supervised neural network WiDeText machine learning model that uses features about the home to predict the likelihood that the next guest will confirm the attribute’s existence.</p><p id="670f" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">By combining our estimate of how important certain home attributes are for a home with the likelihood that the home attribute already exists or needs clarification, we are able to give personalized and relevant recommendations to Hosts on what to acquire, merchandize, and clarify when promoting their home on Airbnb.</p><figure class="kl km kn ko gs kp gg gh paragraph-image"><div role="button" tabindex="0" class="kq kr di ks bf kt"><div class="gg gh lu"><picture></picture></div></div><figcaption class="lm ln gi gg gh lo lp bd b be z dk">Cards shown to Hosts to better promote their listings.</figcaption></figure><p id="83ff" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">What’s next?</strong></p><p id="982b" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">This 
is the first time we’ve known what attributes our guests want down to the home level. What’s important varies greatly based on home location and trip type.</p><p id="4db2" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">This full-stack prioritization system has allowed us to give more relevant and personalized advice to Hosts, to merchandize what guests are looking for, and to accurately represent popular and contentious attributes. When Hosts accurately describe their homes and highlight what guests care about, guests can find their perfect vacation home more easily.</p><p id="f1c7" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We are currently experimenting with highlighting amenities that are most important for each type of home (i.e. kayak for mountain cabin, parking for urban apartment) on the home’s product description page. We believe we can leverage the knowledge gained to improve search and to determine which home attributes are most important for different categories of homes.</p><p id="9ae1" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">On the Host side, we’re expanding this prioritization methodology to encompass additional tips and insights into how Hosts can make their listings even more desirable. This includes actions like freeing up popular nights, offering discounts, and adjusting settings. 
By leveraging unstructured text data to help guests connect with their perfect Host and home, we hope to foster a world where anyone can belong anywhere.</p><p id="f05c" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">If this type of work interests you, check out some of our related positions at <a class="ae kw" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">Careers at Airbnb</a>!</p><h1 id="f464" class="lv lw ip bd lr lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm mn mo mp mq mr bi">Acknowledgments</h1><p id="c8b3" class="pw-post-body-paragraph jm jn ip jo b jp ms jr js jt mt jv jw jx mu jz ka kb mv kd ke kf mw kh ki kj ii bi">It takes a village to build such a robust full-stack platform. Special thanks to (alphabetical by last name) <a class="ae kw" href="https://www.linkedin.com/in/uabbasi/" rel="noopener ugc nofollow" target="_blank">Usman Abbasi</a>, <a class="ae kw" href="https://www.linkedin.com/in/deanchen1/" rel="noopener ugc nofollow" target="_blank">Dean Chen</a>, <a class="ae kw" href="https://www.linkedin.com/in/guillaumeguy/" rel="noopener ugc nofollow" target="_blank">Guillaume Guy</a>, <a class="ae kw" href="https://www.linkedin.com/in/noah-hendrix-2b148366/" rel="noopener ugc nofollow" target="_blank">Noah Hendrix</a>, <a class="ae kw" href="https://www.linkedin.com/in/hwlical/" rel="noopener ugc nofollow" target="_blank">Hongwei Li</a>, <a class="ae kw" href="https://www.linkedin.com/in/xiao-l-593679194/" rel="noopener ugc nofollow" target="_blank">Xiao Li</a>, <a class="ae kw" href="https://www.linkedin.com/in/saraxliu/" rel="noopener ugc nofollow" target="_blank">Sara Liu</a>, <a class="ae kw" href="https://www.linkedin.com/in/qianru-ma-91850749/" rel="noopener ugc nofollow" target="_blank">Qianru Ma</a>, <a class="ae kw" href="https://www.linkedin.com/in/dan-nguyen-b8817a34/" rel="noopener ugc nofollow" target="_blank">Dan Nguyen</a>, <a class="ae 
kw" href="https://www.linkedin.com/in/nguyenmartin/" rel="noopener ugc nofollow" target="_blank">Martin Nguyen</a>, <a class="ae kw" href="https://www.linkedin.com/in/brennanpolley/" rel="noopener ugc nofollow" target="_blank">Brennan Polley</a>, <a class="ae kw" href="https://www.linkedin.com/in/pontefederico/" rel="noopener ugc nofollow" target="_blank">Federico Ponte</a>, <a class="ae kw" href="https://www.linkedin.com/in/jose-toti-rodriguez-0b840463/" rel="noopener ugc nofollow" target="_blank">Jose Rodriguez</a>, <a class="ae kw" href="https://www.linkedin.com/in/peng-wang-13117371/" rel="noopener ugc nofollow" target="_blank">Peng Wang</a>, <a class="ae kw" href="https://www.linkedin.com/in/rongru-yan-7077a036/" rel="noopener ugc nofollow" target="_blank">Rongru Yan</a>, <a class="ae kw" href="https://www.linkedin.com/in/meng-yu-b013011/" rel="noopener ugc nofollow" target="_blank">Meng Yu</a>, <a class="ae kw" href="https://www.linkedin.com/in/luzhangtracy/" rel="noopener ugc nofollow" target="_blank">Lu Zhang</a> for their contributions, dedication, expertise, and thoughtfulness!</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/prioritizing-home-attributes-based-on-guest-interest-3c49b827e51a</link>
      <guid>https://medium.com/airbnb-engineering/prioritizing-home-attributes-based-on-guest-interest-3c49b827e51a</guid>
      <pubDate>Thu, 16 Feb 2023 18:05:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Learning To Rank Diversely]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gj gk gl gm gn go gp gq gr gs l"><div class="ab gt gu"><div class="ab"><div class="fl l"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@malay-haldar?source=post_page-----add6b1929621--------------------------------"><div class="l di"><img alt="Malay Haldar" class="l de bw gv gw fd" src="https://miro.medium.com/fit/c/96/96/1*ZRBnWC44v-zE41Xe6LwIug.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bd b gx gy bi"><div class="gz ab q cb"><div><div class="bk" aria-hidden="false"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@malay-haldar?source=post_page-----add6b1929621--------------------------------">Malay Haldar</a></div></div><div class="ha hb hc hd i d"></div></div><div class="ab q hf"><p class="pw-published-date bd b be z dk">Jan 30</p><div class="hg bk" aria-hidden="true">·</div><div class="pw-reading-time bd b be z dk">6 min read</div></div></div></div><div class="ab q"><div class="h k hh hi hj"><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="ho ab q"></div></div></div><div class="ht s u j i d"><div class="fl l"><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="bl l"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ii ij ik il im"><div class=""><p id="dbeb" class="pw-post-body-paragraph 
jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">by <a class="ae kk" rel="noopener" href="https://medium.com/@malay.haldar">Malay Haldar</a>, Liwei He &amp; <a class="ae kk" rel="noopener" href="https://medium.com/@mooseabdool">Moose Abdool</a></p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gg gh kl"><picture></picture></div></div></figure><p id="1cc6" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Airbnb connects millions of guests and Hosts every day. Most of these connections are forged through search, the results of which are determined by a neural network–based ranking algorithm. While this neural network is adept at selecting <em class="kx">individual listings</em> for guests, we recently improved the neural network to better select the overall <em class="kx">collection of listings</em> that make up a search result. In this post, we dive deeper into this recent breakthrough that enhances the diversity of listings in search results.</p><h1 id="3345" class="ky kz ip bd la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv bi">How Does Ranking Work?</h1><p id="8b0f" class="pw-post-body-paragraph jm jn ip jo b jp lw jr js jt lx jv jw jx ly jz ka kb lz kd ke kf ma kh ki kj ii bi">The ranking neural network finds the best listings to surface for a given query by comparing two listings at a time and predicting which one has the higher probability of getting booked. To generate this probability estimate, the neural network places different weights on various listing attributes such as price, location and reviews. 
These weights are then refined by comparing booked listings against not-booked listings from search logs, with the objective of assigning higher probabilities to booked listings over the not-booked ones.</p><p id="e283" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">What does the ranking neural network learn in the process? As an example, a concept the neural network picks up is that lower prices are preferred. This is illustrated in the figure below, which plots increasing price on the x-axis and its corresponding effect on normalized model scores on the y-axis. Increasing price makes model scores go down, which makes intuitive sense since the majority of bookings at Airbnb skew towards the economical range.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gg gh mb"><picture></picture></div></div><figcaption class="mc md gi gg gh me mf bd b be z dk"><em class="mg">Relation between model scores and percent price increase</em></figcaption></figure><p id="6c0a" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">But price is not the only feature for which the model learns such concepts. Other features such as the listing’s distance from the query location, number of reviews, number of bedrooms, and photo quality can all exhibit such trends. 
Much of the complexity of the neural network is in balancing all these various factors, tuning them to the best possible tradeoffs that fit all cities and all seasons.</p><h1 id="14a2" class="ky kz ip bd la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv bi">Can One Size Fit All?</h1><p id="92e5" class="pw-post-body-paragraph jm jn ip jo b jp lw jr js jt lx jv jw jx ly jz ka kb lz kd ke kf ma kh ki kj ii bi">Because of the way the ranking neural network is constructed, its booking probability estimate for a listing is determined by how many guests in the past have booked listings with similar combinations of price, location, reviews, etc. The notion of higher booking probability essentially translates to what the majority of guests have preferred in the past. For instance, there is a strong correlation between high booking probabilities and low listing prices. The booking probabilities are tailored to location, guest count and trip length, among other factors. However, within that context, the ranking algorithm up-ranks listings that the largest fraction of the guest population would have preferred. This logic is repeated for each position in the search result, so the entire search result is constructed to favor the majority preference of guests. We refer to this as the <em class="kx">Majority principle</em> in ranking — the overwhelming tendency of the ranking algorithm to follow the majority at every position.</p><p id="ffd9" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">But majority preference isn’t the best way to represent the preferences of the entire guest population. Continuing with our discussion of listing prices, we look at the distribution of booked prices for a popular destination — Rome — and specifically focus on two-night trips for two guests. This allows us to focus on price variations due to listing quality alone, and eliminate most other sources of variability. 
The figure below plots the distribution.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gg gh mh"><picture></picture></div></div><figcaption class="mc md gi gg gh me mf bd b be z dk"><em class="mg">Pareto principle: 50/50 split of booking value corresponds to roughly 80/20 split of bookings</em></figcaption></figure><p id="f655" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">The x-axis corresponds to booking values in USD, on a log scale. The left y-axis is the number of bookings corresponding to each price point on the x-axis. The orange shape confirms the log-normal distribution of booking value. The red line plots the percentage of total bookings in Rome that have a booking value less than or equal to the corresponding point on the x-axis, and the green line plots the percentage of total booking value for Rome covered by those bookings. Splitting total booking value 50/50 splits bookings into two unequal groups of ~80/20. In other words, 20% of bookings account for 50% of booking value. For this 20% minority, cheaper is not necessarily better, and their preference leans more towards quality. This demonstrates the <em class="kx">Pareto principle</em>, a coarse view of the heterogeneity of preference among guests.</p><p id="0082" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">While the Pareto principle suggests the need to accommodate a wider range of preferences, the Majority principle summarizes what happens in practice. 
When it comes to search ranking, the Majority principle is at odds with the Pareto principle.</p><h1 id="3556" class="ky kz ip bd la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv bi">Diversifying by Reducing Similarity</h1><p id="61ca" class="pw-post-body-paragraph jm jn ip jo b jp lw jr js jt lx jv jw jx ly jz ka kb lz kd ke kf ma kh ki kj ii bi">The lack of diversity of listings in search results can alternatively be viewed as listings being too similar to each other. Reducing inter-listing similarity, therefore, can remove some of the listings from search results that are redundant choices to begin with. For instance, instead of dedicating every position in the search result to economical listings, we can use some of the positions for quality listings. The challenge here is how to quantify this inter-listing similarity, and how to balance it against the base booking probabilities estimated by the ranking neural network.</p><p id="c897" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">To solve this problem, we build another neural network, a companion to the ranking neural network. The task of this companion neural network is to estimate the similarity of a given listing to previously placed listings in a search result.</p><p id="ede0" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">To train the similarity neural network, we construct the training data from logged search results. All search results where the booked listing appears as the top result are discarded. For the remaining search results, we set aside the top result as a special listing, called the antecedent listing. Using listings from the second position onwards, we create pairs of booked and not-booked listings. 
This is summarized in the figure below.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gg gh mi"><picture></picture></div></div><figcaption class="mc md gi gg gh me mf bd b be z dk"><em class="mg">Construction of training examples from logged search results</em></figcaption></figure><p id="065a" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We then train a ranking neural network to assign a higher booking probability to the booked listing than to the not-booked listing, but with a modification: we subtract the output of the similarity neural network, which supplies a similarity estimate between the given listing and the antecedent listing. The reasoning here is that guests who skipped the antecedent listing and then went on to book a listing from further down the results must have picked something that is dissimilar to the antecedent listing. Otherwise, they would have booked the antecedent listing itself.</p><p id="e445" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Once the similarity network is trained, we are ready to use it for ranking listings online. During ranking, we start by filling the top-most result with the listing that has the highest booking probability. For subsequent positions, we select the listing that has the highest booking probability amongst the remaining listings, after discounting its similarity to the listings already placed above. The search result is constructed iteratively, with each position trying to be diverse from all the positions above it. 
Listings too similar to the ones already placed effectively get down-ranked as illustrated below.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gg gh mj"><picture></picture></div></div><figcaption class="mc md gi gg gh me mf bd b be z dk"><em class="mg">Reranking of listings based on similarity to top results</em></figcaption></figure><p id="811f" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Following this strategy led to one of the most impactful changes to ranking in recent times. We observed an increase of 0.29% in uncancelled bookings, along with a 0.8% increase in booking value. The increase in booking value is far greater than the increase in bookings because the increase is dominated by high-quality listings which correlate with higher value. Increase in booking value provides us with a reliable proxy to measure increase in quality, although increase in booking value is not the target. We also observed some direct evidence of increase in quality of bookings — a 0.4% increase in 5-star ratings, indicating higher guest satisfaction for the entire trip.</p><h1 id="026e" class="ky kz ip bd la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv bi">Further Reading</h1><p id="b8e1" class="pw-post-body-paragraph jm jn ip jo b jp lw jr js jt lx jv jw jx ly jz ka kb lz kd ke kf ma kh ki kj ii bi">We discussed reducing similarity between listings to improve the overall utility of search results and cater to diverse guest preferences. While intuitive, to put the idea in practice we need a rigorous foundation in machine learning, which is described in <a class="ae kk" href="https://arxiv.org/pdf/2210.07774.pdf" rel="noopener ugc nofollow" target="_blank">our technical paper</a>. Up next, we are looking deeper into the location diversity of results. 
We welcome all comments and suggestions for the technical paper and the blog post.</p><p id="8c10" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><em class="kx">Interested in working at Airbnb? Check out </em><a class="ae kk" href="https://careers.airbnb.com/positions/" rel="noopener ugc nofollow" target="_blank"><em class="kx">these open roles</em></a><em class="kx">.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/learning-to-rank-diversely-add6b1929621</link>
      <guid>https://medium.com/airbnb-engineering/learning-to-rank-diversely-add6b1929621</guid>
      <pubDate>Mon, 30 Jan 2023 21:54:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Making Airbnb’s Android app more accessible]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gj gk gl gm gn go gp gq gr gs l"><div class="ab gt gu"><div class="ab"><div class="fl l"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@chengxiaofu?source=post_page-----75618172be6--------------------------------"><div class="l di"><img alt="Julia Fu" class="l de bw gv gw fd" src="https://miro.medium.com/fit/c/96/96/1*2Ee12UYfxMo47NWLGas7Gw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bd b gx gy bi"><div class="gz ab cb"><div><div class="bk" aria-hidden="false"><div class="ab q"><a class="ae af ag ah ai aj ak al am an ao ap aq ar as" rel="noopener follow" href="https://medium.com/@chengxiaofu?source=post_page-----75618172be6--------------------------------">Julia Fu</a></div></div></div><div class="ha hb hc hd i d"></div></div><div class="ab q hf"><p class="pw-published-date bd b be z dk">Jan 11</p><div class="hg bk" aria-hidden="true">·</div><div class="pw-reading-time bd b be z dk">7 min read</div></div></div></div><div class="ab q"><div class="h k hh hi hj"><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hk l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="ho ab q"></div></div></div><div class="ht s u j i d"><div class="fl l"><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="hy l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="l fn"><div><div class="bk" aria-hidden="false"></div></div><div class="bl l"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ii ij ik il im"><div class=""><p id="a7f8" 
class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><strong class="jo iq">By:</strong> <a class="ae kk" href="https://www.linkedin.com/in/julia-fu-3844b712/" rel="noopener ugc nofollow" target="_blank">Julia Fu</a>, <a class="ae kk" href="https://www.linkedin.com/in/peter-elliott-777125144/" rel="noopener ugc nofollow" target="_blank">Peter Elliott</a></p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><div class="gg gh kl"><picture></picture></div></div></figure><p id="6fa9" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">At Airbnb, we have been consciously designing and building products to be equally usable by all users. Making our mobile apps and websites more accessible not only aligns with our company’s mission of creating a world where people can belong anywhere, but also supports the civil rights of people with disabilities and complies with the law.</p><p id="5b97" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">In this article, we highlight some of the efforts we have made to make the app more accessible, such as labeling UI elements, grouping related content, supporting large font scales, and providing headings and page names. The Airbnb app is one of the most popular travel apps with millions of users and supports many features. Making such a complex app more accessible is a huge endeavor that we are continuously working on.</p><h1 id="2c2e" class="kx ky ip bd kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu bi">Part I: Build for all: best practices we apply</h1><p id="a78d" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">At Airbnb, we follow industry best practices to make the Android app accessible. 
If you are interested, you can find all the best practices we follow in the <a class="ae kk" href="https://developer.android.com/guide/topics/ui/accessibility/principles" rel="noopener ugc nofollow" target="_blank">official Android documentation</a> for platform-specific guidelines and the <a class="ae kk" href="https://www.w3.org/WAI/standards-guidelines/wcag/" rel="noopener ugc nofollow" target="_blank">Web Content Accessibility Guidelines</a> as an industry standard. Here we want to highlight a few examples where we apply the best practices:</p><h2 id="52be" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Best practice: content descriptions</h2><p id="48e0" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">Every element should have an accurate content description unless assistive technology should ignore it. In these examples, the share button has a content description that TalkBack reads aloud. TalkBack skips the house icon.</p><div class="km kn ko kp gs ab cb"><figure class="mm kq mn mo mp mq mr paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="mm kq mn mo mp mq mr paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure></div><h2 id="d8fd" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Best practice: grouping</h2><p id="41c8" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">Elements of a natural group can be announced together with focusable containers for better usability and accuracy. 
For instance, TalkBack reads all listing content on the card together.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div class="gg gh ms"><picture></picture></div></figure><h2 id="450b" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Best practice: font scale</h2><p id="ae88" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">The UI shall remain usable when the user increases the system font scale.</p><p id="9f2c" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Default vs enlarged font scale:</p><div class="km kn ko kp gs ab cb"><figure class="mm kq mn mo mp mq mr paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="mm kq mn mo mp mq mr paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div><figcaption class="mt mu gi gg gh mv mw bd b be z dk mx di my mz">Default font scale on the left. Enlarged font scale on the right.</figcaption></figure></div><h2 id="1439" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Scaling best practices</h2><p id="f1da" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">The Airbnb Android app is a large app with many screens. Adding accessibility code to every screen by hand would be exhausting and would not scale. Fortunately, our <a class="ae kk" href="https://airbnb.design/the-way-we-build/" rel="noopener ugc nofollow" target="_blank">Design Language System</a> enables us to broadly apply these best practices across product surfaces in a highly efficient way. Every screen is built with a collection of reusable UI components. 
When we improve the accessibility for one component, the change applies to all the pages with this component as part of the view. This has a long-lasting positive effect on our app’s accessibility improvements. Here’s an example:</p><div class="km kn ko kp gs ab cb"><figure class="mm kq mn mo mp mq mr paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure><figure class="mm kq mn mo mp mq mr paragraph-image"><div role="button" tabindex="0" class="kr ks di kt bf ku"><picture></picture></div></figure></div><p id="8ca0" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Take <em class="na">SectionHeader</em> as an example. This UI component is used to communicate the structure on the page and group content together. We mark this component to be an accessibility heading in the component code so it is accessible in all screens that contain this component.</p><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><h1 id="0f6e" class="kx ky ip bd kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu bi">Part II: Empower engineers with automated checks</h1><p id="9e07" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">We invested in automated accessibility testing and linting to run with every code commit, which creates a quick feedback loop for engineers and empowers them to make the app accessible at code writing time. The checks are fast, reliable, and scale well with our fast-growing features in the Android app.</p><h2 id="c61b" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Automated testing</h2><p id="dd15" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">We set up Espresso-based automated testing to check for accessibility issues. 
<a class="ae kk" href="https://developer.android.com/guide/topics/ui/accessibility/testing#espresso" rel="noopener ugc nofollow" target="_blank">Espresso</a> is a popular testing library for Android UI with built-in accessibility checks. It supports a comprehensive set of accessibility rules and is easy to set up:</p><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><p id="52cf" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">If accessibility checks fail, the test outputs an error stack trace that engineers can use to debug the issue. For example:</p><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><p id="cf9d" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">In this example, engineers can provide a content description to the image view to satisfy accessibility requirements.</p><p id="228d" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We also screenshot test our components with a larger font size to ensure the behavior is correct using <a class="ae kk" rel="noopener" href="https://medium.com/airbnb-engineering/better-android-testing-at-airbnb-a77ac9531cab">Happo</a>.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div class="gg gh ms"><picture></picture></div></figure><h2 id="c2c0" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Linting</h2><p id="19f8" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">In addition to automated testing, we also enabled linting, including <a class="ae kk" href="https://developer.android.com/studio/write/lint" rel="noopener ugc nofollow" target="_blank">Android Lint</a> rules for accessibility and custom lint rules built with <a class="ae kk" 
href="https://github.com/pinterest/ktlint" rel="noopener ugc nofollow" target="_blank">Ktlint</a>.</p><p id="1b3c" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Here is an example of an Android accessibility lint rule:</p><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><p id="77c3" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Besides the built-in Android Lint, we also use Ktlint to build custom lint rules. For instance, when a user navigates to a new screen, we provide a page name for a screen reader to announce. We use the following rule to make sure that the page name is localized.</p><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><p id="6f17" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Lint rules are straightforward to set up and provide timely feedback, but linting has limitations — it can only perform static code analysis.</p><p id="82f2" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">Today, these automated checks run as part of CI (Continuous Integration) checks for every code commit. If a pull request does not pass the checks, it will be blocked from being merged into the primary code branch. We still use manual testing to cover the areas that automated checks do not cover, such as the traversal order of UI elements on a page. 
Automated and manual checks complement each other well.</p><h1 id="912e" class="kx ky ip bd kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu bi">Part III: Looking into the future: Accessibility with Compose</h1><p id="4dc1" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">Over the past year, we have been integrating Jetpack Compose into our app. Google’s <a class="ae kk" href="https://developer.android.com/jetpack/compose/accessibility" rel="noopener ugc nofollow" target="_blank">Accessibility in Compose documentation</a> has been a great resource to ensure our Compose components and screens remain accessible. While there are some notable things missing that existed with Views (e.g. focus order modification), Compose is still a young library and we look forward to future improvements. Here are a couple of things worth mentioning about our Compose-specific accessibility tooling:</p><h2 id="b7c2" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Proactively encourage content descriptions in the API</h2><p id="da5d" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">One of our guidelines for UI components is that content descriptions exposed via a function parameter should not use a default value. This brings accessibility to the top of mind when an engineer uses the component as they need to consider what value to pass. 
A null value is still acceptable in cases where that UI element is not important for accessibility.</p><figure class="km kn ko kp gs kq gg gh paragraph-image"><div class="gg gh ms"><picture></picture></div></figure><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><h2 id="af9b" class="ma ky ip bd kz mb mc dn ld md me dp lh jx mf mg ll kb mh mi lp kf mj mk lt ml bi">Page name announcements</h2><figure class="km kn ko kp gs kq gg gh paragraph-image"><div class="gg gh ms"><picture></picture></div></figure><p id="4eb3" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">When using Fragments and Views, we use the <em class="na">View.setAccessibilityPaneTitle()</em> and <em class="na">View.announceForAccessibility()</em> APIs when navigating to a new screen to announce a descriptive page name to the user. These APIs do not exist in Compose but we wanted to keep the functionality since it helps to provide more context as to what the new screen displays. Our current workaround sets certain semantics on the screen’s outer composable:</p><figure class="km kn ko kp gs kq"><div class="bz fo l di"></div></figure><p id="fdac" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">We use the <em class="na">liveRegion</em> property so changes can be announced when the content description changes. This is useful for pages whose entire content is determined by a response from the server. In this case, TalkBack would announce “Content Loading” while the network request is pending, followed by “Content Loaded” when it completes, and finally the page description defined in the server response. 
One downside of this approach is that it requires the outer container to be focusable, which requires an additional navigation action to get to the content.</p><h1 id="8069" class="kx ky ip bd kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu bi">Closing thoughts</h1><p id="1363" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">Making our Android app more accessible has been an impactful journey. Improving app accessibility involves following best practices, adding rigorous enforcements, continually learning from mistakes, and putting in the work. All of these are worthy efforts to make sure an app works for all users.</p><p id="6537" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi">If you are excited about building highly accessible products and the framework to support them, check out some of our related open positions:</p><p id="4b73" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><a class="ae kk" href="https://careers.airbnb.com/positions/4590099/" rel="noopener ugc nofollow" target="_blank">Staff Android Software Engineer, Guest</a></p><p id="8766" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><a class="ae kk" href="https://careers.airbnb.com/positions/4648432/" rel="noopener ugc nofollow" target="_blank">Senior iOS Software Engineer, Infrastructure</a></p><h1 id="7e52" class="kx ky ip bd kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu bi">Acknowledgments</h1><p id="6bc6" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi">It is a huge endeavor to make a complex app like the Airbnb Android app more accessible. 
This work wouldn’t be possible without the enormous efforts from the digital accessibility team and the close-knit Android community at Airbnb. Every engineer has contributed to making the features they own accessible. Making the Android app more accessible is an ongoing effort and it could not succeed without all of them.</p><h1 id="282d" class="kx ky ip bd kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu bi">****************</h1><p id="c8dc" class="pw-post-body-paragraph jm jn ip jo b jp lv jr js jt lw jv jw jx lx jz ka kb ly kd ke kf lz kh ki kj ii bi"><em class="na">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p><p id="f8bc" class="pw-post-body-paragraph jm jn ip jo b jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj ii bi"><em class="na">All bookings included in this blog post are intended to illustrate. Airbnb does not endorse or promote these listings or any other accommodations or experiences on the platform.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/making-airbnbs-android-app-more-accessible-75618172be6</link>
      <guid>https://medium.com/airbnb-engineering/making-airbnbs-android-app-more-accessible-75618172be6</guid>
      <pubDate>Wed, 11 Jan 2023 20:09:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[When a Picture Is Worth More Than Words]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@caoyuanpei?source=post_page-----17718860dcc2--------------------------------"><div class="l do"><img alt="Yuanpei Cao" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/1*wZhTYXwJYILN7S_XlD52-w.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><div class="o ao"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@caoyuanpei?source=post_page-----17718860dcc2--------------------------------">Yuanpei Cao</a></div></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Dec 8</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">8 min read</div></div></div></div><div class="o ao"><div class="h k hv hw hx"><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ib o ao"></div><div class="ck ih"></div></div></div><div class="ii ij ik j i d"><div class="fj l"><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="a75d" 
class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">How Airbnb uses visual attributes to enhance the Guest and Host experience</p><p id="0fc0" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><em class="lb">By </em><a class="au lc" href="https://www.linkedin.com/in/yuanpei-cao-792b103b/" rel="noopener ugc nofollow" target="_blank"><em class="lb">Yuanpei Cao</em></a><em class="lb">, </em><a class="au lc" href="https://www.linkedin.com/in/bulam/" rel="noopener ugc nofollow" target="_blank"><em class="lb">Bill Ulammandakh</em></a><em class="lb">, </em><a class="au lc" href="https://www.linkedin.com/in/hao-wang-2661553/" rel="noopener ugc nofollow" target="_blank"><em class="lb">Hao Wang</em></a><em class="lb">, and </em><a class="au lc" href="https://www.linkedin.com/in/hwangtt/" rel="noopener ugc nofollow" target="_blank"><em class="lb">Tony Hwang</em></a></p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ld"><picture></picture></div></div></figure><h1 id="1253" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga"><strong class="ba">Introduction</strong></h1><p id="0aa3" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">On Airbnb, our hosts share unique listings all over the world. There are hundreds of millions of accompanying listing photos on Airbnb. Listing photos contain crucial information about style and design aesthetics that are difficult to convey in words or a fixed list of amenities. 
Accordingly, multiple teams at Airbnb are now leveraging computer vision to extract and incorporate intangibles from our rich visual data to help guests easily find listings that suit their preferences.</p><p id="8ed3" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In previous blog posts titled <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/widetext-a-multimodal-deep-learning-framework-31ce2565880c"><em class="lb">WIDeText: A Multimodal Deep Learning Framework</em></a>, <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/categorizing-listing-photos-at-airbnb-f9483f3ab7e3"><em class="lb">Categorizing Listing Photos at Airbnb</em></a>, and <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/amenity-detection-and-beyond-new-frontiers-of-computer-vision-at-airbnb-144a4441b72e"><em class="lb">Amenity Detection and Beyond — New Frontiers of Computer Vision at Airbnb</em></a>, we explored how we use computer vision for room categorization and amenity detection to map listing photos to a taxonomy of discrete concepts. This post goes beyond discrete categories and describes how Airbnb leverages image aesthetics and embeddings to optimize various product surfaces, including ad content, listing presentation, and listing recommendations.</p><h1 id="0875" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Image aesthetics</h1><p id="3295" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">Attractive photos are as vital as price, reviews, and description during a guest’s Airbnb search journey. To quantify the “attractiveness” of photos, we developed a deep learning-based image aesthetics assessment pipeline. 
The underlying model is a deep convolutional neural network (<a class="au lc" href="https://en.wikipedia.org/wiki/Convolutional_neural_network" rel="noopener ugc nofollow" target="_blank">CNN</a>) trained on human-labeled image aesthetic rating distributions. Each photo was rated on a scale from 1 to 5 by hundreds of photographers based on their personal aesthetic judgment (the higher the rating, the better the aesthetic). Unlike traditional classification approaches that sort a photo into low-, medium-, and high-quality categories, the model uses the Earth Mover’s Distance (<a class="au lc" href="https://en.wikipedia.org/wiki/Earth_mover%27s_distance" rel="noopener ugc nofollow" target="_blank">EMD</a>) as its loss function to predict the photographers’ rating distribution.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 1. The model that predicts the image aesthetics distribution is CNN-based and trained with the EMD loss function. Suppose the ground truth label of a photo is as follows: 10% of raters give a rating of 1, 10% give 2, 20% give 3, 30% give 4, and 30% give 5. The corresponding prediction is [0.1, 0.1, 0.2, 0.3, 0.3].</em></figcaption></figure><p id="a8f4" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The predicted mean rating is highly correlated with image resolution and listing booking probability, as well as with the photo distribution of high-end Airbnb listings. 
Rating thresholds are set per use case, such as ad photo recommendation on social media and photo order suggestion in the listing onboarding process.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm mx"><picture></picture></div></div></figure><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm my"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 2. Examples of Airbnb listing photos with aesthetic scores above the 90th percentile</em></figcaption></figure><h1 id="f022" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Image aesthetic-based ads quality improvement</h1><p id="9b29" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">Airbnb uses advertising on social media to attract new customers and inspire our community. The social media platform chooses which ads to run based on millions of Airbnb-provided listing photos.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm mz"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 3. Airbnb Ads displayed on Facebook</em></figcaption></figure><p id="b3d3" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Since a visually appealing Airbnb photo can effectively attract users to the platform and considerably increase an ad’s click-through rate (CTR), we used the image aesthetic score and room categorization to select the most attractive Airbnb photos of the living room, bedroom, kitchen, and exterior view. 
The criterion for “good quality” listing photos was initially set to the top half of aesthetic scores (the 50th percentile and above) and then tuned based on an internal manual aesthetic evaluation of 1,000 randomly selected listing cover photos. We ran an A/B test for this use case and found that ad candidates with a higher aesthetic score generated a substantially higher CTR and booking rate.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm na"><picture></picture></div></div></figure><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm na"><picture></picture></div></div></figure><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm na"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 4. Airbnb Creative Ads pre-selected through image aesthetics and room type filters</em></figcaption></figure><h1 id="ac06" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Automated photo ranking based on home design and room type</h1><p id="86de" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">When posting a new listing on Airbnb, hosts upload numerous photos. Optimally arranging these photos to highlight a home can be time-consuming and challenging. A host may also be uncertain about the ideal arrangement for their images because the work requires making trade-offs between photo attractiveness, photo diversity, and content relevance to guests. In particular, the first five photos are the most important for listing success, as they are the most frequently viewed and crucial to forming the initial guest impression. 
Accordingly, we developed an automated photo ranking algorithm that selects and orders the first five photos of a home by leveraging two visual signals: home design evaluation and room categorization.</p><p id="f99a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Home design evaluation estimates how well a home is designed from an interior design and architecture perspective. The CNN-based home design evaluation model is trained on Airbnb <em class="lb">Plus</em> and <em class="lb">Luxe</em> qualification data, which assesses the aesthetic appeal of each photo’s home design. Airbnb <em class="lb">Plus</em> and <em class="lb">Luxe</em> listings have passed strict home design evaluation criteria, so the data from their qualification process is well suited for use as training labels for a home design evaluation model. The photos are then classified into room types, such as living room, bedroom, and bathroom, by the room categorization model. Finally, an algorithm makes trade-offs between home design attractiveness, photo relevance, and photo diversity to maximize the booking probability of a home. Below is an example of how a new photo order is suggested. 
The photo auto-rank feature was launched in the Host listing onboarding product in 2021, leading to significant lifts in new listing creation and booking success.</p><p id="4666" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Original ordering</strong></p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div></figure><p id="22c8" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Auto-suggested ordering</strong></p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 5. An example of the original photo order (top) uploaded by an Airbnb Host and the auto-suggested order (bottom) calculated by the proposed algorithm</em></figcaption></figure><h1 id="d1cd" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Image similarity</h1><p id="c8b8" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">Beyond aesthetics, photos also capture general appearance and content. To efficiently represent this information, we encode and compress photos into image embeddings using computer vision models. Image embeddings are compact vector representations of images that encode visual features. 
These embeddings can be compared against each other with a distance metric that represents similarity in that feature space.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 6. Image embeddings can be compared by distance metrics like cosine similarity to represent their similarity in the encoded latent space</em></figcaption></figure><p id="ed1b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The features learned by the encoder are directly influenced by the training image data distribution and training objectives. Our labeled room type and amenity classification data allows us to train models on this data distribution to produce semantically meaningful embeddings for listing photo similarity use cases. However, as the quantity and diversity of images on Airbnb grow, it becomes increasingly untenable to rely solely on manually labeled data and supervised training techniques. Consequently, we are currently exploring self-supervised contrastive training to improve our image embedding models. This form of training does not require image labels; instead, it bootstraps contrastive learning with synthetically generated positive and negative pairs. Our image embedding models can then learn key visual features from listing photos without manual supervision.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 7. 
Introducing random image transformations to synthetically create positive and negative pairs helps refine our image encoders without additional labeling.</em></figcaption></figure><h1 id="c7d2" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Scalable embedding search</h1><p id="0f7f" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">It is often impractical to compute exhaustive pairwise embedding similarity, even within focused subsets of millions of items. To support real-time search use cases, such as (near) duplicate photo detection and visual similarity search, we instead perform an approximate nearest neighbor (<a class="au lc" href="https://en.wikipedia.org/wiki/Nearest_neighbor_search#Approximate_nearest_neighbor" rel="noopener ugc nofollow" target="_blank">ANN</a>) search. This functionality is largely enabled by an efficient embedding index preprocessing and construction algorithm called Hierarchical Navigable Small World (<a class="au lc" href="https://arxiv.org/abs/1603.09320" rel="noopener ugc nofollow" target="_blank">HNSW</a>). HNSW builds a hierarchical proximity graph structure that greatly constrains the search space at query time. We scale this horizontally with AWS OpenSearch, where each node contains its own HNSW embedding graphs and Lucene-backed indices that are hydrated periodically and can be queried in parallel. 
To add real-time embedding ANN search, we have implemented the following index hydration and index search design patterns enabled by existing Airbnb internal platforms.</p><p id="48fa" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To hydrate an embedding index on a periodic basis, all relevant embeddings computed by <a class="au lc" href="https://ieeexplore.ieee.org/document/8964147" rel="noopener ugc nofollow" target="_blank">Bighead</a>, Airbnb’s end-to-end machine learning platform, are aggregated and persisted into a Hive table. The encoder models producing the embeddings are deployed for both online inference and offline batch processing. Then, the incremental embedding update is synced to the embedding index on AWS OpenSearch through Airflow, our data pipeline orchestration service.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 8. Index hydration data pathway</em></figcaption></figure><p id="5798" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To perform image search, a client service will first verify whether the image’s embedding exists in the OpenSearch index cache to avoid recomputing embeddings unnecessarily. If the embedding is already there, the OpenSearch cluster can return approximate nearest neighbor results to the client without further processing. 
If there is a cache miss, Bighead is called to compute the image embedding, followed by a request to query the OpenSearch cluster for approximate nearest neighbors.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm nb"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 9. Image similarity search for a previously unseen image</em></figcaption></figure><p id="42c4" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Following this embedding search framework, we are scaling real-time visual search in current production flows and upcoming releases.</p><h1 id="8f3a" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Expanding Airbnb categories</h1><p id="6612" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga"><a class="au lc" href="https://www.airbnb.com/2022-summer" rel="noopener ugc nofollow" target="_blank">Airbnb Categories</a> help our guests discover unique getaways. Some examples are “Amazing views”, “Historical homes”, and “Creative spaces”. These categories do not always share common amenities or discrete attributes, as they often represent an inspirational concept. We are exploring automatic category expansion by identifying similar listings based on their photos, which do capture design aesthetics.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm nc"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 10. 
Listing photos from the “Creative spaces” category</em></figcaption></figure><h1 id="c78a" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Similar listing recommendations in rebooking assistance</h1><p id="f164" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">In the 2022 Summer Release, Airbnb introduced rebooking assistance to offer guests a smooth experience from Community Support ambassadors when a Host cancels on short notice. To recommend comparable listings throughout the rebooking process, a two-tower reservation and listing embedding model, updated daily, ranks candidate listings. As future work, we can consider augmenting the listing representation with image embeddings and enabling real-time search.</p><figure class="le lf lg lh gx li gl gm paragraph-image"><div role="button" tabindex="0" class="lj lk do ll ce lm"><div class="gl gm ms"><picture></picture></div></div><figcaption class="mt bl gn gl gm mu mv bm b bn bo cn"><em class="mw">Figure 11. An example of a landing page that recommends similar listings to guests and Community Support ambassadors during rebooking assistance.</em></figcaption></figure><h1 id="f637" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Conclusion</h1><p id="bcf3" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">Photos contain aesthetic and style-related signals that are difficult to express in words or map to discrete attributes. 
Airbnb is increasingly leveraging these visual attributes to help our hosts highlight the unique character of their listings and to assist our guests in discovering listings that match their preferences.</p><p id="be59" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Interested in working at Airbnb? Check out our <a class="au lc" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open roles</a>.</p><h1 id="0077" class="lp lq jg bm lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm ga">Acknowledgements</h1><p id="6ba4" class="pw-post-body-paragraph kd ke jg kf b kg mn ki kj kk mo km kn ko mp kq kr ks mq ku kv kw mr ky kz la iz ga">Thanks to Teng Wang, Regina Wu, Nan Li, Do-kyum Kim, Tiantian Zhang, Xiaohan Zeng, Mia Zhao, Wayne Zhang, Elaine Liu, Floria Wan, David Staub, Tong Jiang, Cheng Wan, Guillaume Guy, Wei Luo, Hanchen Su, Fan Wu, Pei Xiong, Aaron Yin, Jie Tang, Lifan Yang, Lu Zhang, Mihajlo Grbovic, Alejandro Virrueta, Brennan Polley, Jing Xia, Fanchen Kong, William Zhao, Caroline Leung, Meng Yu, Shijing Yao, Reid Andersen, Xianjun Zhang, Yuqi Zheng, Dapeng Li, and Juchuan Ma for the product collaborations. Thanks also to Jenny Chen, Surashree Kulkarni, and Lauren Mackevich for editing.</p><p id="4f00" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Thanks to Ari Balogh, Tina Su, Andy Yasutake, Joy Zhang, Kelvin Xiong, Raj Rajagopal, and Zhong Ren for their leadership support in building computer vision products at Airbnb.</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/when-a-picture-is-worth-more-than-words-17718860dcc2</link>
      <guid>https://medium.com/airbnb-engineering/when-a-picture-is-worth-more-than-words-17718860dcc2</guid>
      <pubDate>Thu, 08 Dec 2022 19:03:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Motion Engineering at Scale]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@calstephens98?source=post_page-----5ffabfc878--------------------------------"><div class="l do"><img alt="Cal Stephens" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/0*IiSO0Uuw0yTYVQbE.jpg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@calstephens98?source=post_page-----5ffabfc878--------------------------------">Cal Stephens</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Dec 7</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">8 min read</div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="29af" class="pw-post-body-paragraph kd ke jg 
kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">How Airbnb is applying declarative design patterns to rapidly build fluid transition animations</p><p id="f819" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">By: </strong><a class="au lb" href="https://www.linkedin.com/in/calstephens/" rel="noopener ugc nofollow" target="_blank">Cal Stephens</a></p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm lc"><picture></picture></div></div></figure><p id="08ab" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Motion is a key part of what makes a digital experience both easy and delightful to use. Fluid transitions between states and screens are key for helping the user preserve context as they navigate throughout a feature. Quick flourishes of animation make an app come alive, and help give it a distinct personality.</p><p id="4fc0" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">At Airbnb, we launch hundreds of features and experiments that have been developed by engineers across many teams. When building at this scale, it’s critical to consider efficiency and maintainability throughout our tech stack, and motion is no exception. Adding animations to a feature needs to be fast and easy. The tooling must complement and fit naturally with other components of our feature architecture. 
If an animation takes too long to build or is too difficult to integrate with the overall feature architecture, then it’s often the first part of a product experience that gets dropped when translating from design to implementation.</p><p id="c148" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In this post, we’ll discuss a new framework for iOS that we’ve created to help make this vision a reality.</p><h1 id="2c22" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Imperative UIKit Transitions</h1><p id="4212" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Let’s consider this transition on the Airbnb app’s homepage, which takes users from search results to an expanded search input screen:</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div class="gl gm mr"><picture></picture></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">An example transition from Airbnb’s iOS app of expanding and collapsing the search input screen</figcaption></figure><p id="d3d5" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The transition is a key part of the design, making the entire search experience feel cohesive and lightweight.</p><p id="2df2" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Within traditional UIKit patterns, there are two ways to build a transition like this. One is to create a single, massive view controller that contains both the search results and the search input screens, and orchestrates a transition between the two states using imperative <a class="au lb" href="https://developer.apple.com/documentation/uikit/uiview/1622418-animate" rel="noopener ugc nofollow" target="_blank"><em class="mv">UIView</em> animation blocks</a>. 
While this approach is easy to build, it has the downside of tightly coupling these two screens, making them far less maintainable and portable.</p><p id="6d55" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The other approach is to implement each screen as a separate view controller, and create a bespoke <a class="au lb" href="https://developer.apple.com/documentation/uikit/uiviewcontrolleranimatedtransitioning?language=objc" rel="noopener ugc nofollow" target="_blank"><em class="mv">UIViewControllerAnimatedTransitioning</em></a> implementation that extracts relevant views from each view hierarchy and then animates them. This is typically more complicated to implement, but has the key benefit of letting each individual screen be built as a separate <em class="mv">UIViewController</em>, as you would for any other feature.</p><p id="97e0" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In the past, we’ve built transitions with both of these approaches, and found that they both typically require hundreds of lines of fragile, imperative code. This meant custom transitions were time-consuming to build and difficult to maintain, so they were typically not included as part of a team’s main feature development flow.</p><p id="3667" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">A common trend has been to move away from this sort of <em class="mv">imperative</em> system design and towards <em class="mv">declarative</em> patterns. We use declarative systems extensively at Airbnb: we leverage frameworks like <a class="au lb" rel="noopener" href="https://medium.com/airbnb-engineering/introducing-epoxy-for-ios-6bf062be1670">Epoxy</a> and SwiftUI to declaratively define the layout of each screen. 
Screens are combined into features and flows using <a class="au lb" href="https://github.com/airbnb/epoxy-ios#epoxypresentations" rel="noopener ugc nofollow" target="_blank">declarative</a> <a class="au lb" href="https://github.com/airbnb/epoxy-ios#epoxynavigationcontroller" rel="noopener ugc nofollow" target="_blank">navigation</a> APIs. We’ve found these declarative systems unlock substantial productivity gains by letting engineers focus on defining how the app should behave and abstracting away the complex underlying implementation details.</p><h1 id="93b1" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Declarative Transition Animations</h1><p id="8253" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">To simplify and speed up the process of adding transitions to our app, we’ve created a <strong class="kf jh">new</strong> <strong class="kf jh">framework for building transitions declaratively</strong>, rather than imperatively as we did before. We’ve found that this new approach has made it much simpler to build custom transitions, and as a result far more engineers have been able to easily add rich and delightful transitions to their screens even on tight timelines.</p><p id="4f13" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To perform a transition with this framework, you simply provide the <em class="mv">initial</em> state and <em class="mv">final</em> state (or, in the case of a screen transition, the <em class="mv">source</em> and <em class="mv">destination</em> view controllers) along with a declarative <em class="mv">transition definition</em> of how each individual element on the screen should be animated. 
The framework’s generic <em class="mv">UIViewControllerAnimatedTransitioning</em> implementation handles everything else automatically.</p><p id="0393" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">This new framework has become instrumental to how we build features. It powers many of the new features included in Airbnb’s <a class="au lb" href="https://news.airbnb.com/2022-summer-release/" rel="noopener ugc nofollow" target="_blank">2022 Summer Release</a> and <a class="au lb" href="https://www.airbnb.com/2022-winter" rel="noopener ugc nofollow" target="_blank">2022 Winter Release</a>, helping make them easy and delightful to use:</p><div class="ld le lf lg gx o hc"><figure class="mw lh mx my mz na nb paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><picture></picture></div></figure><figure class="mw lh nc my mz na nb paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><picture></picture></div></figure><figure class="mw lh nd my mz na nb paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><picture></picture></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn ne do nf ng">Example transitions in Airbnb’s iOS app from new features introduced in 2022</figcaption></figure></div><p id="5cd2" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">As an introduction, let’s start with an example. 
Here’s a simple “search” interaction where a date picker in a bottom sheet slides up over a page of content:</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div class="gl gm mr"><picture></picture></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">An example transition for a simple “search” feature</figcaption></figure><p id="a32e" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In this example, there are two separate view controllers: the search results screen and the date picker screen. Each of the components we want to animate is tagged with an identifier to establish its identity.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nh"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn"><em class="ni">Diagram showing the search results screen and date picker screen annotated with component identifiers</em></figcaption></figure><p id="1e20" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">These identifiers let us refer to each component semantically by name, rather than by directly referencing the <em class="mv">UIView</em> instance. For example, the <em class="mv">Explore.searchNavigationBarPill</em> component on each screen is a separate <em class="mv">UIView</em> instance, but since they’re tagged with the same identifier, the two view instances are considered separate “states” of the same component.</p><p id="258b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Now that we’ve identified the components that we want to animate, we can define <em class="mv">how</em> they should animate. 
For this transition we want:</p><ol class=""><li id="7dff" class="nj nk jg kf b kg kh kk kl ko nl ks nm kw nn la no np nq nr ga">The background to fade in</li><li id="770a" class="nj nk jg kf b kg ns kk nt ko nu ks nv kw nw la no np nq nr ga">The bottom sheet to slide up from the bottom of the screen</li><li id="e87a" class="nj nk jg kf b kg ns kk nt ko nu ks nv kw nw la no np nq nr ga">The navigation bar to animate between the first state and second state (a “shared element” animation).</li></ol><p id="6098" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We can express this as a simple transition definition:</p><pre class="ld le lf lg gx nx ny nz ir oa ob ga">let transitionDefinition: TransitionDefinition = [
  BottomSheet.backgroundView: .crossfade,
  BottomSheet.foregroundView: .edgeTranslation(.bottom),
  Explore.searchNavigationBarPill: .sharedElement,
]</pre><p id="254e" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Revisiting the example above for expanding and collapsing the search input screen, we want:</p><ol class=""><li id="a460" class="nj nk jg kf b kg kh kk kl ko nl ks nm kw nn la no np nq nr ga">The background to blur</li><li id="7d4f" class="nj nk jg kf b kg ns kk nt ko nu ks nv kw nw la no np nq nr ga">The top and bottom bars to slide in</li><li id="70e5" class="nj nk jg kf b kg ns kk nt ko nu ks nv kw nw la no np nq nr ga">The home screen search bar to transition into the “where are you going?” card</li><li id="deb1" class="nj nk jg kf b kg ns kk nt ko nu ks nv kw nw la no np nq nr ga">The other two search cards to fade in while staying anchored relative to the “where are you going?” card</li></ol><p id="8ef8" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Here’s how that animation is defined using the declarative transition definition syntax:</p><pre class="ld le lf lg gx nx ny nz ir oa ob ga">let transitionDefinition: TransitionDefinition = [
  SearchInput.background: .blur,
  SearchInput.topBar: .translateY(-40),
  SearchInput.bottomBar: .edgeTranslation(.bottom),
  SearchInput.whereCard: .sharedElement,
  SearchInput.whereCardContent: .crossfade,
  SearchInput.searchInput: .crossfade,
  SearchInput.whenCard: .anchorTranslation(relativeTo: SearchInput.whereCard),
  SearchInput.whoCard: .anchorTranslation(relativeTo: SearchInput.whereCard),
]</pre><h1 id="1a97" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">How It Works</h1><p id="eaca" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">This declarative <em class="mv">transition definition</em> API is powerful and flexible, but it only tells half the story. To actually perform the animation, our framework provides a generic <em class="mv">UIViewControllerAnimatedTransitioning</em> implementation that takes the transition definition and orchestrates the transition animation. To explore how this implementation works, we’ll return to the simple “search” interaction.</p><p id="e2b1" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">First, the framework traverses the view hierarchy of both the <em class="mv">source</em> and <em class="mv">destination</em> screens to extract the <em class="mv">UIView</em> for each of the identifiers being animated. 
This determines whether or not a given identifier is present on each screen, and forms an <em class="mv">identifier hierarchy</em> (much like the view hierarchy of a screen).</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nh"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn"><em class="ni">The “identifier hierarchy” of the source and destination screens</em></figcaption></figure><p id="3780" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The identifier hierarchies of the <em class="mv">source</em> and <em class="mv">destination</em> are diffed to determine whether an individual component was added, removed, or present in both. If the view was added or removed, the framework will use the animation specified in the transition definition. If the view was present in both states, the framework instead performs a “shared element animation” where the component animates from its initial position to its final position while its content is updated. These shared elements are animated recursively–each component can provide its own identifier hierarchy of child elements, which is diffed and animated as well.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nh"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn"><em class="ni">The final identifier hierarchy after diffing the source and destination screens</em></figcaption></figure><p id="f218" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To actually perform these animations, we need a single <em class="mv">view hierarchy</em> that matches the structure of our <em class="mv">identifier hierarchy</em>. 
We can’t just combine the source and destination screens into a single view hierarchy by layering them on top of each other, because the ordering would be wrong. In this case, if we just placed the destination screen over the source screen, then the source <em class="mv">Explore.searchNavigationBarPill</em> view would be below the destination <em class="mv">BottomSheet.backgroundView</em> element, which doesn’t match the identifier hierarchy.</p><p id="d00f" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Instead, we have to create a separate view hierarchy that matches the structure of the identifier hierarchy. This requires making copies of the components being animated and adding them to the UIKit transition container. Most <em class="mv">UIView</em>s aren’t trivially copyable, so copies are typically made by “snapshotting” the view (rendering it as an image). We temporarily hide the “original view” while the animation is playing, so only the snapshot is visible.</p><p id="9370" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Once the framework has set up the transition container’s view hierarchy and determined the specific animation to use for each component, the animations just have to be applied and played. 
This is where the underlying imperative <em class="mv">UIView</em> animations are performed.</p><h1 id="0bf3" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Conclusion</h1><p id="7e22" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">As with <a class="au lb" rel="noopener" href="https://medium.com/airbnb-engineering/introducing-epoxy-for-ios-6bf062be1670">Epoxy</a> and other declarative systems, abstracting away the underlying complexity and providing a simple declarative interface makes it possible for engineers to focus on the <em class="mv">what</em> rather than the <em class="mv">how</em>. The declarative transition definitions for these animations are only a few lines of code, which is by itself a <em class="mv">huge</em> improvement over any feasible imperative implementation. And since our declarative feature-building APIs have first-class support for UIKit <a class="au lb" href="https://developer.apple.com/documentation/uikit/uiviewcontrolleranimatedtransitioning?language=objc" rel="noopener ugc nofollow" target="_blank"><em class="mv">UIViewControllerAnimatedTransitioning</em></a> implementations, these declarative transitions can be integrated into existing features without making any architecture changes. This significantly accelerates feature development, making it easier than ever to create highly polished transitions, while also enabling long-term flexibility and maintainability.</p><p id="9a8d" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We have a packed roadmap ahead. One area of active work is improving interoperability with SwiftUI. This lets us seamlessly transition between UIKit and SwiftUI-based screens, which unlocks incremental adoption of SwiftUI in our app without having to sacrifice motion. 
We’re also exploring making similar frameworks available on web and Android. Our long-term goal here is to make it as easy as possible to translate our designers’ great ideas into actual shipping products, on all platforms.</p><p id="4c05" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Interested in working at Airbnb? Check out these open roles:</p><p id="49f5" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" href="https://careers.airbnb.com/positions/4693375/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Wishlists</a></p><p id="9902" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" href="https://careers.airbnb.com/positions/4665949/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Guests &amp; Hosts</a></p><p id="dfb1" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" href="https://careers.airbnb.com/positions/4590099/" rel="noopener ugc nofollow" target="_blank">Staff Android Software Engineer, Guest</a></p><h1 id="facf" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Acknowledgments</h1><p id="1d25" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Many thanks to Eric Horacek and Matthew Cheok for their major contributions to Airbnb’s motion architecture and our declarative transition framework.</p></div><div class="o dx oh oi ii oj" role="separator"><div class="iz ja jb jc jd"><p id="634b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><em class="mv">All product names, logos, and brands are property of their respective owners. 
All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/motion-engineering-at-scale-5ffabfc878</link>
      <guid>https://medium.com/airbnb-engineering/motion-engineering-at-scale-5ffabfc878</guid>
      <pubDate>Wed, 07 Dec 2022 19:00:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Announcing Lottie 4.0 for iOS]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@calstephens98?source=post_page-----d4d226862a54--------------------------------"><div class="l do"><img alt="Cal Stephens" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/0*IiSO0Uuw0yTYVQbE.jpg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@calstephens98?source=post_page-----d4d226862a54--------------------------------">Cal Stephens</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Dec 6</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">5 min read</div></div></div></div><div class="o ao"><div class="h k hv hw hx"><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ib o ao"></div><div class="ck ih"></div></div></div><div class="ii ij ik j i d"><div class="fj l"><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="0182" class="pw-post-body-paragraph kd ke 
jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">A new rendering engine with significant performance improvements powered by Core Animation</p><p id="6967" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">By: </strong><a class="au lb" href="https://www.linkedin.com/in/calstephens/" rel="noopener ugc nofollow" target="_blank">Cal Stephens</a></p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm lc"><picture></picture></div></div></figure><p id="c043" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" href="https://airbnb.design/lottie/" rel="noopener ugc nofollow" target="_blank"><strong class="kf jh">Lottie</strong></a> is Airbnb’s <a class="au lb" href="https://airbnb.io/lottie/#/README" rel="noopener ugc nofollow" target="_blank">cross-platform</a>, <a class="au lb" href="https://github.com/airbnb/lottie-ios" rel="noopener ugc nofollow" target="_blank">open source</a> library for rendering vector motion graphics. 
We use Lottie extensively at Airbnb, and it also powers animations in thousands of other apps throughout the industry.</p><div class="ld le lf lg gx o hc"><figure class="lo lh lp lq lr ls lt paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><picture></picture></div></figure><figure class="lo lh lu lq lr ls lt paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><picture></picture></div></figure><figure class="lo lh lv lq lr ls lt paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><picture></picture></div><figcaption class="lw bl gn gl gm lx ly bm b bn bo cn lz do ma mb">Example Lottie animations included in Airbnb’s iOS app</figcaption></figure></div><p id="0939" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Today we’re releasing <strong class="kf jh">Lottie 4.0 </strong>for iOS. This major new release brings <strong class="kf jh">significant performance improvements</strong> to all Lottie animations, with a brand new rendering engine powered by Core Animation.</p><p id="c346" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Using Lottie at scale for many years, we’ve learned a lot about its performance characteristics in real-world use cases. We found that it was relatively common for Lottie animations to drop frames in some of our more complex screens. To understand why, we first have to take a look at how Lottie previously rendered animations.</p><p id="2f07" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Previous versions of Lottie played animations on the app’s main thread, effectively using a <a class="au lb" href="https://developer.apple.com/documentation/quartzcore/cadisplaylink?language=objc" rel="noopener ugc nofollow" target="_blank"><em class="mc">CADisplayLink</em></a>. 
Once per frame, Lottie would execute code on the main thread to advance the progress of the animation and re-render its content. This meant that animations would consume 5–20%+ of the CPU while playing, leaving fewer CPU cycles available for the rest of the app:</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm md"><picture></picture></div></div><figcaption class="lw bl gn gl gm lx ly bm b bn bo cn"><em class="me">Playing an animation with Lottie 3.5.0, using the original main thread rendering engine</em></figcaption></figure><p id="c207" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">This also meant that animations would not update when the main thread was busy. This could cause animations to drop frames or freeze entirely, which results in a poor user experience:</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm mf"><picture></picture></div></div><figcaption class="lw bl gn gl gm lx ly bm b bn bo cn"><em class="me">Lottie animations dropping frames when the main thread is overloaded</em></figcaption></figure><p id="ead0" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">These issues are inherent limitations of using a main-thread-bound rendering architecture.</p><p id="c0e5" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">On iOS, the most performant and power-efficient way to play animations is by using <a class="au lb" href="https://developer.apple.com/documentation/quartzcore?language=objc" rel="noopener ugc nofollow" target="_blank">Core Animation</a>. This system framework renders animations out-of-process with GPU hardware acceleration. 
Animation playback is managed by a separate system process called the “render server”. This means Core Animation-powered animations don’t contribute to the CPU utilization of the app process itself, and can continue even when its main thread is blocked or busy.</p><p id="c752" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Throughout 2022, we’ve been working on a new rendering engine implementation for Lottie built on top of Core Animation. For each of the layers in the animation JSON file, the new engine builds a <em class="mc">CALayer</em> and applies <em class="mc">CAAnimation</em>s with keyframes for the layer’s animated properties. Lottie passes these animation keyframes off to Core Animation, which takes care of actually rendering them on-screen and updating the animation each frame.</p><p id="0c3d" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">This new engine eliminates the CPU overhead from playing a Lottie animation, and effectively guarantees that Lottie animations will animate smoothly at 60 or 120 fps regardless of the app’s CPU load.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm mg"><picture></picture></div></div><figcaption class="lw bl gn gl gm lx ly bm b bn bo cn"><em class="me">Playing an animation with Lottie 4.0, using the new Core Animation rendering engine</em></figcaption></figure><p id="7e8a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Since animations rendered by the new engine don’t execute any code on the app’s main thread, apps now have more resources available for other functionality. This is especially valuable when running tasks with high CPU load. As an example, the Airbnb app displays a Lottie animation when starting up for the first time. 
We ran an experiment here and found that switching to the new rendering engine <em class="mc">reduces</em> our app’s total launch time, while <em class="mc">also</em> improving the frame-rate and UX of the startup animation.</p><p id="9d74" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We <a class="au lb" href="https://github.com/airbnb/lottie-ios/discussions/1627" rel="noopener ugc nofollow" target="_blank">first introduced</a> the Core Animation rendering engine in Lottie 3.4.0 earlier this year, behind an opt-in feature flag. We’ve been using the new engine by default for all Lottie animations in the Airbnb app for over six months, and have been hard at work fixing issues reported by early-adopters in the community.</p><p id="9083" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Starting in today’s Lottie 4.0 release for iOS, the Core Animation rendering engine is <strong class="kf jh">enabled by default</strong> for all apps using Lottie, with no additional work or migration required by app developers. 
This is a major milestone that we’ve been working towards for a long time, and we hope it helps raise the bar for animation quality and performance even higher throughout the industry!</p><p id="8502" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Lottie 4.0 for iOS also includes several significant enhancements contributed by members of the community:</p><ul class=""><li id="1af5" class="mh mi jg kf b kg kh kk kl ko mj ks mk kw ml la mm mn mo mp ga">Support for <a class="au lb" href="https://dotlottie.io/" rel="noopener ugc nofollow" target="_blank">dotLottie animation files</a>, which are much smaller in size than standard JSON files</li><li id="8f53" class="mh mi jg kf b kg mq kk mr ko ms ks mt kw mu la mm mn mo mp ga">A new animation decoding implementation that is ~2x faster than the previous <em class="mc">Codable</em>-based implementation</li></ul><p id="9fa6" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">You can learn more about Lottie, and our commitment to open source, in previous posts we’ve published:</p><ul class=""><li id="95cc" class="mh mi jg kf b kg kh kk kl ko mj ks mk kw ml la mm mn mo mp ga"><a class="au lb" href="https://airbnb.design/introducing-lottie/" rel="noopener ugc nofollow" target="_blank">Introducing Lottie</a>: Behind the scenes of our new open-source animation tool</li><li id="80cf" class="mh mi jg kf b kg mq kk mr ko ms ks mt kw mu la mm mn mo mp ga"><a class="au lb" rel="noopener" href="https://medium.com/airbnb-engineering/lottie-and-swift-at-airbnb-e0c85dc365e7">Moving Lottie Swiftly into the Future</a>: A personal story on how Airbnb rewrote the popular open source library Lottie in a new language</li></ul><p id="b196" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">If this type of work interests you, check out some of our related 
positions:</p><p id="e642" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Interested in working at Airbnb? Check out these open roles:</p><p id="b137" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" href="https://careers.airbnb.com/positions/4693375/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Wishlists</a></p><p id="bb29" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" href="https://careers.airbnb.com/positions/4665949/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Guests &amp; Hosts</a></p><h1 id="f38f" class="mv mw jg bm mx my mz na nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns ga">Acknowledgments</h1><p id="b6ca" class="pw-post-body-paragraph kd ke jg kf b kg nt ki kj kk nu km kn ko nv kq kr ks nw ku kv kw nx ky kz la iz ga">Many thanks to Eric Horacek for first proposing this project and reviewing 100+ pull requests over the past year. Also thanks to Brandon Withrow, the original author of Lottie, plus the <a class="au lb" href="https://github.com/airbnb/lottie-ios/graphs/contributors" rel="noopener ugc nofollow" target="_blank">many other contributors</a> who have helped out over the years.</p></div><div class="o dx ny nz ii oa" role="separator"><div class="iz ja jb jc jd"><p id="b8ba" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><em class="mc">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/announcing-lottie-4-0-for-ios-d4d226862a54</link>
      <guid>https://medium.com/airbnb-engineering/announcing-lottie-4-0-for-ios-d4d226862a54</guid>
      <pubDate>Tue, 06 Dec 2022 21:01:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[How AI Text Generation Models Are Reshaping Customer Support at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@lyo.gavin?source=post_page-----a851db0b4fa3--------------------------------"><div class="l do"><img alt="Gavin Li" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/0*FeeK9pylAmmkY3_4." width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@lyo.gavin?source=post_page-----a851db0b4fa3--------------------------------">Gavin Li</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Nov 23</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">10 min read</div></div></div></div><div class="o ao"><div class="h k hv hw hx"><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ib o ao"></div><div class="ck ih"></div></div></div><div class="ii ij ik j i d"><div class="fj l"><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="de65" class="pw-post-body-paragraph kd ke jg kf b kg kh ki 
kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Leveraging text generation models to build more effective, scalable customer support products.</strong></p><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm lb"><picture></picture></div></div></figure><p id="0dae" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au ln" href="https://www.linkedin.com/in/gavin-li-64354117/" rel="noopener ugc nofollow" target="_blank">Gavin Li</a>, <a class="au ln" href="https://www.linkedin.com/in/mia-zhao-964a9213/" rel="noopener ugc nofollow" target="_blank">Mia Zhao</a> and <a class="au ln" href="https://www.linkedin.com/in/zhenyu-zhao-30b8632a/" rel="noopener ugc nofollow" target="_blank">Zhenyu Zhao</a></p><p id="e57f" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">One of the fastest-growing areas in modern Artificial Intelligence (AI) is <a class="au ln" href="https://huggingface.co/tasks/text-generation" rel="noopener ugc nofollow" target="_blank">AI text generation models</a>. As the name suggests, these models generate natural language. Previously, most industrial natural language processing (NLP) models were classifiers, or what might be called discriminative models in machine learning (ML) literature. However, in recent years, generative models based on large-scale language models are rapidly gaining traction and fundamentally changing how ML problems are formulated. 
Generative models can now obtain some domain knowledge through large-scale pre-training and then produce high-quality text, for instance by answering questions or paraphrasing a piece of content.</p><p id="3eee" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">At Airbnb, we’ve heavily invested in AI text generation models in our community support (CS) products, which has enabled many new capabilities and use cases. This article will discuss three of these use cases in detail. However, first let’s talk about some of the beneficial traits of text generation models that make them a good fit for our products.</p><h1 id="d519" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">About Text Generation Models</h1><p id="b9ac" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Applying AI models in large-scale industrial applications like Airbnb customer support is not easy. Real-life applications have many long-tail corner cases and can be hard to scale, and labeling their training data is often costly. There are several traits of text generation models that address these challenges and make this option particularly valuable.</p><h1 id="b026" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Encoding Knowledge</h1><p id="2860" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">The first attractive trait is the capability to encode domain knowledge into the language models. As illustrated by <a class="au ln" href="https://arxiv.org/abs/1909.01066" rel="noopener ugc nofollow" target="_blank">Petroni et al. (2019)</a>, we can encode domain knowledge through large-scale pre-training and transfer learning. In traditional ML paradigms, input matters a lot. 
The model is just a transformation function from the input to the output. The model training focuses mainly on preparing input, feature engineering, and training labels. For generative models, by contrast, the key is knowledge encoding. How well we can design the pre-training and training to encode high-quality knowledge into the model, and how well we design prompts to induce this knowledge, is far more critical. This fundamentally changes how we solve traditional problems like classification, ranking, and candidate generation.</p><p id="d98a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Over the past several years, we have accumulated massive amounts of records of our human agents offering help to our guests and hosts at Airbnb. We’ve then used this data to design large-scale pre-training and training to encode knowledge about solving users’ travel problems. At inference time, we’ve designed prompt input to generate answers based directly on the encoded human knowledge. This approach produced significantly better results compared to traditional classification paradigms. A/B testing showed significant business metric improvement as well as significantly better user experience.</p><h1 id="02d1" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Unsupervised Learning</h1><p id="af7b" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">The second trait of the text generation model we’ve found attractive is its “unsupervised” nature. Large-scale industrial use cases like Airbnb often have large amounts of user data. How to mine helpful information and knowledge to train models becomes a challenge. First, labeling large amounts of data by human effort is very costly, significantly limiting the training data scale we could use. 
Second, designing good labeling guidelines and a comprehensive label taxonomy of user issues and intents is challenging because real-life problems often have a long-tail distribution and lots of nuanced corner cases. It doesn’t scale to rely on human effort to exhaust all the possible user intent definitions.</p><p id="a99f" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The unsupervised nature of the text generation model allows us to train models without labeling data at scale. During pre-training, in order to learn how to predict the target labels, the model is forced to first gain a certain understanding of the problem taxonomy. Essentially, the model is doing some of the data labeling design for us internally and implicitly. This solves the scalability issues when it comes to intent taxonomy design and cost of labeling, and therefore opens up many new opportunities. We’ll see some examples of this when we dive into use cases later in this post.</p><h1 id="d273" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">More Natural and Productive Language Models</h1><p id="1bde" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Finally, text generation models transcend the traditional boundaries of ML problem formulations. Over the past few years, researchers have realized that the extra dense layers in autoencoding models may be unnatural, counterproductive, and restrictive. In fact, all of the typical machine learning tasks and problem formulations can be viewed as different manifestations of the single, unifying problem of language modeling. 
A classification can be formatted as a type of language model where the output text is the literal string representation of the classes.</p><p id="9ecc" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In order to make the language model unification effective, a new but essential role is introduced: the <strong class="kf jh">prompt</strong>. A prompt is a short piece of textual instruction that informs the model of the task at hand and sets the expectation for what the format and content of the output should be. Along with the prompt, additional natural language annotations, or hints, are also highly beneficial in further contextualizing the ML problem as a language generation task. The incorporation of prompts has been demonstrated to significantly improve the quality of language models on a variety of tasks. The figure below illustrates the anatomy of a high-quality input text for universal generative modeling.</p><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm mr"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Figure 1.1 An example of the prompt and input feature design of our text generation model</figcaption></figure><p id="6e2c" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Now, let’s dive into a few ways that text generation models have been applied within Airbnb’s Community Support products. 
We’ll explore three use cases — content recommendation, real-time agent assistance, and chatbot paraphrasing.</p><h1 id="478c" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Content Recommendation Model</h1><p id="32a0" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Our content recommendation workflow, powering both Airbnb’s Help Center search and the support content recommendation in our <a class="au ln" rel="noopener" href="https://medium.com/airbnb-engineering/using-chatbots-to-provide-faster-covid-19-community-support-567c97c5c1c9">Helpbot</a>, utilizes pointwise ranking to determine the order of the documents users receive, as shown in Figure 2.1. This pointwise ranker takes the textual representation of two pieces of input — the current user’s issue description and the candidate document, in the form of its title, summary, and keywords. It then computes a relevance score between the description and the document, which is used for ranking. Prior to 2022, this pointwise ranker had been implemented using XLMRoBERTa; however, we’ll see shortly why we switched to the MT5 model.</p><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm mv"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Figure 2.1 How we utilized an encoder-only architecture with an arbitrary classification head to perform pointwise document ranking</figcaption></figure><p id="369b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Following the design decision to introduce prompts, we transformed the classic binary classification problem into a prompt-based language generation problem. The input is still derived from both the issue description and the candidate document’s textual representation. 
However, we contextualize the input by prepending a prompt to the description that informs the model that we expect a binary answer, either “Yes” or “No”, of whether the document would be helpful in resolving the issue. We also added annotations to provide extra hints to the intended roles of the various parts of the input text, as illustrated in the figure below. To enable personalization, we expanded the issue description input with textual representations of the user and their reservation information.</p><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm mr"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Figure 2.2. How we leveraged an encoder-decoder architecture with a natural language output to serve as a pointwise ranker</figcaption></figure><p id="9052" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We fine-tuned the MT5 model on the task described above. In order to evaluate the quality of the generative classifier, we used production traffic data sampled from the same distribution as the training data. The generative model demonstrated significant improvements in the key performance metric for support document ranking, as illustrated in the table below.</p><figure class="lc ld le lf gx lg"><div class="m fs l do"><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Table 2.1 Airbnb Support Content Recommendation</figcaption></div></figure><p id="c8af" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In addition, we also tested the generative model in an online A/B experiment, integrating the model into Airbnb’s Help Center, which has millions of active users. 
The successful experimentation results led to the same conclusion — the generative model recommends documents with significantly higher relevance in comparison with the classification-based baseline model.</p><h1 id="1aaf" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">‘Real-Time Agent Assistant’ Model</h1><p id="91f4" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Equipping agents with the right contextual knowledge and powerful tools leads to better experiences for our customers. So we provide our agents with just-in-time guidance, which directs them to the correct answers consistently and helps them resolve user issues efficiently.</p><p id="b3cf" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">For example, through agent-user conversations, suggested templates are displayed to assist agents in problem solving. To make sure our suggestions are enforced within CS policy, suggestion templates are gated by a combination of API checks and model intent checks. 
This model needs to answer questions to capture user intents such as:</p><ul class=""><li id="d84f" class="my mz jg kf b kg kh kk kl ko na ks nb kw nc la nd ne nf ng ga">Is this message about a cancellation?</li><li id="8f75" class="my mz jg kf b kg nh kk ni ko nj ks nk kw nl la nd ne nf ng ga">What cancellation reason did this user mention?</li><li id="20e8" class="my mz jg kf b kg nh kk ni ko nj ks nk kw nl la nd ne nf ng ga">Is this user canceling due to a COVID sickness?</li><li id="559d" class="my mz jg kf b kg nh kk ni ko nj ks nk kw nl la nd ne nf ng ga">Did this user accidentally book a reservation?</li></ul><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm nm"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Figure 3.1 AI-generated recommendation template</figcaption></figure><p id="deef" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In order to support many granular intent checks, we developed a mastermind Question-Answering (QA) model, aiming to help answer all related questions. This QA model was developed using the generative model architecture mentioned above. We concatenate multiple rounds of user-agent conversations to leverage chat history as input text and then ask the prompt we care about at the point in time of serving.</p><p id="36f4" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Prompts are naturally aligned with the same questions we ask humans to annotate. Slightly different prompts would result in different answers as shown below. 
Based on the model’s answer, relevant templates are then recommended to agents.</p><figure class="lc ld le lf gx lg"><div class="m fs l do"><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Table 3.1 Prompt design for mastermind QA model</figcaption></div></figure><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm no"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Figure 3.2 Mastermind QA model architecture</figcaption></figure><p id="e7ba" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We leveraged backbone models such as t5-base and Narrativa and experimented with various training dataset compositions, including annotation-based data and logging-based data with additional post-processing. Annotation datasets usually have higher precision, lower coverage, and more consistent noise, while logging datasets have lower precision, higher case coverage, and more random noise. We found that combining these two datasets yielded the best performance.</p><figure class="lc ld le lf gx lg"><div class="m fs l do"><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Table 3.2 Experiment results for mastermind QA model</figcaption></div></figure><p id="7154" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Due to the large number of model parameters, we leverage a library called <a class="au ln" href="https://github.com/microsoft/DeepSpeed" rel="noopener ugc nofollow" target="_blank">DeepSpeed</a> to train the generative model on multiple GPUs. DeepSpeed helps to speed up the training process from weeks to days. That being said, hyperparameter tuning typically takes longer. We therefore run experiments on smaller datasets first to get a better sense of direction for parameter settings. 
In production, online testing with real CS ambassadors showed a large engagement rate improvement.</p><h1 id="de8d" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Paraphrase Model in Chatbot</h1><p id="9369" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Accurate intent detection, slot filling, and effective solutions are not sufficient for building a successful AI chatbot. Users often choose not to engage with the chatbot, no matter how good the ML model is. Users want to solve problems quickly, so they are constantly trying to assess if the bot is understanding their problem and if it will resolve the issue faster than a human agent. Building a paraphrase model, which first rephrases the problem a user describes, can give users some confidence and confirm that the bot’s understanding is correct. This has significantly improved our bot’s engagement rate. Below is an example of our chatbot automatically paraphrasing the user’s description.</p><figure class="lc ld le lf gx lg gl gm paragraph-image"><div role="button" tabindex="0" class="lh li do lj ce lk"><div class="gl gm nq"><picture></picture></div></div><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Figure 4.1 An actual example of the chatbot paraphrasing a user’s description of a payment issue</figcaption></figure><p id="612b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">This method of paraphrasing a user’s problem is used often by human customer support agents. The most common pattern of this is “I understand that you…”. 
For example, if the user asks if they can cancel the reservation for free, the agent will reply with, “I understand that you want to cancel and would like to know if we can refund the payment in full.” We built a simple template to extract all the conversations where an agent’s reply starts with that key phrase. Because we have many years of agent-user communication data, this simple heuristic gives us millions of training labels for free.</p><p id="27a8" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We tested popular sequence-to-sequence transformer model backbones like <a class="au ln" href="https://arxiv.org/abs/1910.13461" rel="noopener ugc nofollow" target="_blank">BART</a>, <a class="au ln" href="https://doi.org/10.48550/ARXIV.1912.08777" rel="noopener ugc nofollow" target="_blank">PEGASUS</a>, <a class="au ln" href="http://arxiv.org/abs/1910.10683" rel="noopener ugc nofollow" target="_blank">T5</a>, etc., and autoregressive models like <a class="au ln" href="https://doi.org/10.48550/ARXIV.1907.05774" rel="noopener ugc nofollow" target="_blank">GPT2</a>, etc. For our use case, the T5 model produced the best performance.</p><p id="7359" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">As found by <a class="au ln" href="https://arxiv.org/abs/1905.05709" rel="noopener ugc nofollow" target="_blank">Huang et al. (2020)</a>, one of the most common issues with text generation models is that they tend to generate bland, generic, uninformative replies. 
This was also the major challenge we faced.</p><p id="900b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">For example, the model outputs the same reply for many different inputs: “I understand that you have some issues with your reservation.” Though correct, this is too generic to be useful.</p><p id="99cc" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We tried several different solutions. First, we tried to build a backward model to predict <em class="nr">P(source|target)</em>, as introduced by <a class="au ln" href="https://arxiv.org/abs/1911.00536" rel="noopener ugc nofollow" target="_blank">Zhang et al. (2020)</a>, and use it as a reranking model to filter out results that were too generic. Second, we tried to use some rule-based or model-based filters.</p><p id="2d65" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In the end, we found the best solution was to tune the training data. To do this, we ran text clustering on the training target data based on pre-trained similarity models from <a class="au ln" href="https://www.sbert.net/" rel="noopener ugc nofollow" target="_blank">Sentence-Transformers</a>. As seen in the table below, the training data contained too many generic, meaningless replies, which caused the model to produce similarly generic output.</p><figure class="lc ld le lf gx lg"><div class="m fs l do"><figcaption class="ms bl gn gl gm mt mu bm b bn bo cn">Table 4.2 Top clusters in the training labels</figcaption></div></figure><p id="0879" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We labeled all clusters that were too generic and used Sentence-Transformers to filter them out from the training data. 
This approach worked significantly better and gave us a high-quality model to put into production.</p><h1 id="a082" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Conclusion</h1><p id="a2d2" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">With the fast growth of large-scale pre-training-based transformer models, the text generation models can now encode domain knowledge. This not only allows them to utilize the application data better, but allows us to train models in an unsupervised way that helps scale data labeling. This enables many innovative ways to tackle common challenges in building AI products. As demonstrated in the three use cases detailed in this post — content ranking, real-time agent assistance, and chatbot paraphrasing — the text generation models improve our user experiences effectively in customer support scenarios. We believe that text generation models are a crucial new direction in the NLP domain. They help Airbnb’s guests and hosts solve their issues more swiftly and assist Support Ambassadors in achieving better efficiency and a higher resolution of the issues at hand. 
We look forward to continuing to invest actively in this area.</p><h1 id="1abb" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Acknowledgments</h1><p id="c1ad" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Thank you <a class="au ln" href="https://www.linkedin.com/in/weipingpeng/" rel="noopener ugc nofollow" target="_blank">Weiping Peng</a>, <a class="au ln" href="https://www.linkedin.com/in/xin-liu-908b6b18/" rel="noopener ugc nofollow" target="_blank">Xin Liu</a>, <a class="au ln" href="https://www.linkedin.com/in/mukundn/" rel="noopener ugc nofollow" target="_blank">Mukund Narasimhan</a>, <a class="au ln" href="https://www.linkedin.com/in/cmujoy/" rel="noopener ugc nofollow" target="_blank">Joy Zhang</a>, <a class="au ln" href="https://www.linkedin.com/in/tina-su-saratoga/" rel="noopener ugc nofollow" target="_blank">Tina Su</a>, <a class="au ln" href="https://www.linkedin.com/in/ayasutake/" rel="noopener ugc nofollow" target="_blank">Andy Yasutake</a> for reviewing and polishing the blog post content and for all the great suggestions. Thank you <a class="au ln" href="https://www.linkedin.com/in/cmujoy/" rel="noopener ugc nofollow" target="_blank">Joy Zhang</a>, <a class="au ln" href="https://www.linkedin.com/in/tina-su-saratoga/" rel="noopener ugc nofollow" target="_blank">Tina Su</a>, <a class="au ln" href="https://www.linkedin.com/in/ayasutake/" rel="noopener ugc nofollow" target="_blank">Andy Yasutake</a> for their leadership support! Thank you <a class="au ln" href="https://www.linkedin.com/in/elaineliu5/" rel="noopener ugc nofollow" target="_blank">Elaine Liu</a> for building the paraphrase end-to-end product, running the experiments, and launching. 
Thank you to our close PM partners, <a class="au ln" href="https://www.linkedin.com/in/shuangyi-cassie-cao/" rel="noopener ugc nofollow" target="_blank">Cassie Cao</a> and <a class="au ln" href="https://www.linkedin.com/in/jerryhong/" rel="noopener ugc nofollow" target="_blank">Jerry Hong</a>, for their PM expertise. This work could not have happened without their efforts.</p><p id="686f" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><em class="nr">Interested in working at Airbnb? Check out </em><a class="au ln" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank"><em class="nr">these</em></a><em class="nr"> open roles.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/how-ai-text-generation-models-are-reshaping-customer-support-at-airbnb-a851db0b4fa3</link>
      <guid>https://medium.com/airbnb-engineering/how-ai-text-generation-models-are-reshaping-customer-support-at-airbnb-a851db0b4fa3</guid>
      <pubDate>Wed, 23 Nov 2022 18:38:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Building Airbnb Categories with ML and Human-in-the-Loop]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@mihajlo.grbovic?source=post_page-----e97988e70ebb--------------------------------"><div class="l do"><img alt="Mihajlo Grbovic" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/1*hdBvFFL4w9pznHLwQUJjdw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@mihajlo.grbovic?source=post_page-----e97988e70ebb--------------------------------">Mihajlo Grbovic</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Nov 21</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">9 min read</div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><div class=""><h2 id="6c66" 
class="pw-subtitle-paragraph kd jf jg bm b ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku cn"><strong class="ba">Airbnb Categories Blog Series — Part I</strong></h2></div><p id="ae81" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">By: <strong class="kx jh">Mihajlo Grbovic, Ying Xiao, Pratiksha Kadam, Aaron Yin, Pei Xiong, Dillon Davis, Aditya Mukherji, Kedar Bellare, Haowei Zhang, Shukun Yang, Chen Qian, Sebastien Dubois, Nate Ney, James Furnary, Mark Giangreco, Nate Rosenthal, Cole Baker, Bill Ulammandakh, Sid Reddy, Egor Pakhomov</strong></p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm lr"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 1. Browsing listings by categories: <strong class="bm mg">Castles</strong>, <strong class="bm mg">Desert</strong>, <strong class="bm mg">Design</strong>, <strong class="bm mg">Beach </strong>&amp;<strong class="bm mg"> Countryside</strong></figcaption></figure><h1 id="3b69" class="mh mi jg bm mg mj mk ml mm mn mo mp mq km mr kn ms kp mt kq mu ks mv kt mw mx ga">25 Years of Online Travel Search</h1><p id="0af1" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">Online travel search hasn’t changed much in the last 25 years. The traveler enters her destination, dates, and the number of guests into a search interface, which dutifully returns a list of options that best meet the criteria. Eventually, Airbnb and other travel sites made improvements to allow for better filtering, ranking, personalization and, more recently, to display results slightly outside of the specified search parameters–for example, by accommodating flexible dates or by suggesting nearby locations. 
Taking a page from the travel agency model, these websites also built more “inspirational” browsing experiences that recommend popular destinations, showcasing these destinations with captivating imagery and inventory (think digital “catalog”).</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm nd"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 2. Airbnb Destination Recommendation Example</figcaption></figure><p id="2ffd" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">The biggest shortcoming of these approaches is that the traveler must have a specific destination in mind. Even travelers who are flexible get funneled to a similar set of well-known destinations, reinforcing the cycle of mass tourism.</p><h1 id="2ad5" class="mh mi jg bm mg mj mk ml mm mn mo mp mq km mr kn ms kp mt kq mu ks mv kt mw mx ga">Introducing Airbnb Categories</h1><p id="e26e" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">In our recent release, we flipped the travel search experience on its head by having the inventory dictate the destinations, not the other way around. In this way, we sought to inspire the traveler to book unique stays in places they might not think to search for. By leading with our unique places to stay, grouped together into cohesive “categories”, we inspired our guests to find some incredible places to stay off the beaten path.</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm ne"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 3. 
Unique, travel-worthy inventory in lesser-known destinations that users are unlikely to search for</figcaption></figure><p id="d601" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">Though our goal was an intuitive browsing experience, pulling it off required considerable work behind the scenes. In this three-part series, we will pull back the curtain on the technical aspects of the <a class="au nf" href="https://news.airbnb.com/2022-summer-release/" rel="noopener ugc nofollow" target="_blank">Airbnb 2022 Summer Launch</a>.</p><ul class=""><li id="68c2" class="ng nh jg kx b ky kz lb lc le ni li nj lm nk lq nl nm nn no ga"><strong class="kx jh">Part I </strong>(this post) is a high-level introduction to how we applied machine learning to build out the listing collections and to solve different tasks related to the browsing experience–specifically, quality estimation, photo selection and ranking.</li><li id="f650" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Part II </strong>of the series focuses on ML Categorization of listings into categories. It explains the approach in more detail, including signals and labels that we used, tradeoffs we made, and how we set up a human-in-the-loop feedback system.</li><li id="56f4" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Part III</strong> focuses on ML Ranking of Categories depending on the search query. For example, we taught the model to show the Skiing category first for an Aspen, Colorado query versus Beach/Surfing for a Los Angeles query. 
That post will also cover our approach for ML Ranking of listings within each category.</li></ul><h1 id="dfb3" class="mh mi jg bm mg mj mk ml mm mn mo mp mq km mr kn ms kp mt kq mu ks mv kt mw mx ga">Grouping Listings into Categories</h1><p id="12d6" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">Airbnb has thousands of unique, high-quality listings, many of which have received design and architecture awards or have been featured in travel magazines or movies. However, these listings are sometimes hard to discover because they are in a little-known town or because they are not ranked highly enough by the search algorithm, which optimizes for bookings. While these unique listings may not always be as bookable as others due to lower availability or higher price, they are great for inspiration and for helping guests discover hidden destinations where they may end up booking a stay influenced by the category.</p><p id="4f24" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">To showcase these special listings, we decided to group them into collections of homes organized by what makes them unique. 
The result was <strong class="kx jh">Airbnb Categories, </strong>collections of homes revolving around some common themes including the following:</p><ul class=""><li id="a667" class="ng nh jg kx b ky kz lb lc le ni li nj lm nk lq nl nm nn no ga"><strong class="kx jh">Categories that revolve around a location or a place of interest (POI)</strong> such as Coastal, Lake, National Parks, Countryside, Tropical, Arctic, Desert, Islands, etc.</li><li id="41b2" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Categories that revolve around an activity</strong> such as Skiing, Surfing, Golfing, Camping, Wine tasting, Scuba, etc.</li><li id="b479" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Categories that revolve around a home type</strong> such as Barns, Castles, Windmills, Houseboats, Cabins, Caves, Historical, etc.</li><li id="1352" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Categories that revolve around a home amenity</strong> such as Amazing Pools, Chef’s Kitchen, Grand Pianos, Creative Spaces, etc.</li></ul><p id="ac5e" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">We defined 56 categories and outlined the definition for each category. Now all that was left to do was to assign our entire catalog of listings to categories.</p><p id="7b87" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">With the Summer launch just a few months away, we knew that we could not manually curate all the categories, as it would be very time consuming and costly. We also knew that we could not generate all the categories in a rule-based manner, as this approach would not be accurate enough. Finally, we knew we could not produce an accurate ML categorization model without a training set of human-generated labels. 
Given all of these limitations, we decided to combine the accuracy of human review with the scale of ML models to create a human-in-the-loop system for listing categorization and display.</p><h2 id="2dc9" class="nu mi jg bm mg nv nw nx mm ny nz oa mq le ob oc ms li od oe mu lm of og mw oh ga">Rule-Based Candidate Generation</h2><p id="b040" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">Before we could build a trained ML model for assigning listings to categories, we had to rely on various listing- and geo-based signals to generate the initial set of candidates. We named this technique <strong class="kx jh"><em class="oi">weighted sum of indicators</em></strong><em class="oi">. </em>It consists of building out a set of signals (indicators) that associate a listing with a specific category. The more indicators the listing has, the better the chances of it belonging to that category.</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm oj"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 4. Rule-based weighted sum of indicators approach to produce candidates for human review</figcaption></figure><p id="e6bb" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">For example, let’s consider a listing that is within 100 meters of a Lake POI, with keyword “lakefront” mentioned in listing title and guest reviews, lake views appearing in listing photos and several kayaking activities nearby. All this information together strongly indicates that the listing belongs to the <em class="oi">Lakefront</em> category. The weighted sum of these indicators totals to a high <strong class="kx jh"><em class="oi">score</em></strong>, which means that this listing-category pair would be a strong candidate for human review. 
When rule-based candidate generation produced a large set of candidates, we used this score to prioritize listings for human review and maximize the initial yield.</p><h2 id="eb18" class="nu mi jg bm mg nv nw nx mm ny nz oa mq le ob oc ms li od oe mu lm of og mw oh ga">Human Review</h2><p id="f74c" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">The manual review of candidates consists of several tasks. Given a listing candidate for a particular category or several categories, an agent would:</p><ul class=""><li id="116a" class="ng nh jg kx b ky kz lb lc le ni li nj lm nk lq nl nm nn no ga"><strong class="kx jh">Confirm/reject the category or categories</strong> assigned to the listing by comparing it to the category definition.</li><li id="833a" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Pick the photo</strong> that best represents the category. Listings can belong to multiple categories, so it is sometimes appropriate to pick a different photo to serve as the cover image for different categories.</li><li id="42df" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Determine the quality tier </strong>of the selected photo. Specifically, we defined <strong class="kx jh">four quality tiers:</strong> <strong class="kx jh"><em class="oi">Most Inspiring</em></strong>, <strong class="kx jh"><em class="oi">High Quality</em></strong>, <strong class="kx jh"><em class="oi">Acceptable Quality</em></strong>, and <strong class="kx jh"><em class="oi">Low Quality. 
</em></strong>We use this information to rank the higher quality listings near the top of the results to achieve the “wow” effect with prospective guests.</li><li id="3a8e" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga">Some of the categories rely on signals related to <strong class="kx jh">Places of Interest (POIs) data</strong> such as the locations of lakes or national parks, so the reviewers could add a POI that we were missing in our database.</li></ul><h2 id="a6a8" class="nu mi jg bm mg nv nw nx mm ny nz oa mq le ob oc ms li od oe mu lm of og mw oh ga">Candidate Expansion</h2><p id="b24c" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">Although the rule-based approach can generate many candidates for some categories, for others (e.g., Creative Spaces, Amazing Views) it may produce only a limited set of listings. In those cases, we turn to candidate expansion. One such technique leverages pre-trained listing embeddings. Once a human reviewer confirms that a listing belongs to a particular category, we can find similar listings via cosine similarity. Very often the 10 nearest neighbors are good candidates for the same category and can be sent for human review. We detailed one of the embedding approaches in our previous<a class="au nf" rel="noopener" href="https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e"> blog post</a> and have developed new ones since then.</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm ok"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 5. 
Listing similarity via embeddings can help find more listings that are from the same category</figcaption></figure><p id="0c70" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">Other expansion techniques include keyword expansions, POI data expansions, etc.</p><h2 id="c14f" class="nu mi jg bm mg nv nw nx mm ny nz oa mq le ob oc ms li od oe mu lm of og mw oh ga">Training ML Models</h2><p id="7098" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">Once we collected enough human-generated labels, we trained a binary classification model that predicts whether or not a listing belongs to a specific category. We then used a holdout set to evaluate performance of the model using a precision-recall (PR) curve. Our goal here was to evaluate if the model was good enough to send highly confident listings directly to production.</p><p id="f3b2" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">Figure 6 shows a trained ML model for the Lakefront category. On the left we can see the feature importance graph, indicating which signals contribute most to the decision of whether or not a listing belongs to the Lakefront category. On the right we can see the hold out set PR curve of different model versions.</p><div class="ls lt lu lv gx o hc"><figure class="ol lw om on oo op oq paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><picture></picture></div></figure><figure class="ol lw or on oo op oq paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><picture></picture></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn os do ot ou">Figure 6. 
Lakefront ML model feature importance and performance evaluation</figcaption></figure></div><p id="5e98" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga"><strong class="kx jh">Sending confident listings to production: </strong>Using a PR curve, we can set a threshold that achieves 90% precision on a downsampled holdout set that mimics the true listing distribution. Then we can score all unlabeled listings and send the ones above that threshold to production, with the expectation that about 90% of them truly belong to the category. In this particular case, we can achieve 76% recall at 90% precision, meaning that with this technique we can expect to capture 76% of the true Lakefront listings in production.</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm nd"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 7. Basic ML + Human in the Loop setup for tagging listings with categories</figcaption></figure><p id="f136" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga"><strong class="kx jh">Selecting listings for human review: </strong>Given the expectation of 76% recall, to cover the rest of the Lakefront listings we also need to send listings below the threshold for human evaluation. When prioritizing the below-threshold listings, we considered the photo quality score for the listing and the current coverage of the category to which the listing was tagged, among other factors. Once a human reviewer confirmed a listing’s category assignment, that tag would be made available to production. 
Concurrently, we send the tags back to our ML models for retraining, so that the models improve over time.</p><p id="6b5b" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga"><strong class="kx jh">ML models for quality estimation and photo selection. </strong>In addition to the ML Categorization models described above, we also trained a Quality ML model that assigns one of the four quality tiers to the listing, as well as a Vision Transformer Cover Image ML model that chooses the listing photo that best represents the category. In the current implementation, the Cover Image ML model takes the category information as an input signal, while the Quality ML model is a global model for all categories. The three ML models work together to assign category, quality, and cover photo. Listings with these assigned attributes are sent directly into production under certain circumstances and also queued for review.</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm nd"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 8. Human vs. ML flow to production</figcaption></figure><h2 id="d67f" class="nu mi jg bm mg nv nw nx mm ny nz oa mq le ob oc ms li od oe mu lm of og mw oh ga">Two New Ranking Algorithms</h2><p id="bd12" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga"><a class="au nf" href="https://news.airbnb.com/2022-summer-release/" rel="noopener ugc nofollow" target="_blank">The Airbnb Summer release</a> introduced categories to both the homepage (Figure 9 left), where we show categories that are popular near you, and location searches (Figure 9 right), where we show categories that are related to the searched destination. 
For example, in the case of a Lake Tahoe location search we show <em class="oi">Skiing, Cabins, Lakefront, Lake House, etc.</em>, and <em class="oi">Skiing</em> should be shown first if searching in winter.</p><p id="845f" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">In both cases, this created a need for two new ranking algorithms:</p><ul class=""><li id="5245" class="ng nh jg kx b ky kz lb lc le ni li nj lm nk lq nl nm nn no ga"><strong class="kx jh">Category ranking </strong>(green arrow in Figure 9 left): How to rank categories from left to right, by taking into account user origin, season, category popularity, inventory, bookings and user interests</li><li id="f3bf" class="ng nh jg kx b ky np lb nq le nr li ns lm nt lq nl nm nn no ga"><strong class="kx jh">Listing Ranking</strong> (blue arrow in Figure 9 left): given all the listings assigned to the category, rank them from top to bottom by taking into account assigned listing quality tier and whether a given listing was sent to production by humans or by ML models.</li></ul><div class="ls lt lu lv gx o hc"><figure class="ol lw ov on oo op oq paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><picture></picture></div></figure><figure class="ol lw ow on oo op oq paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><picture></picture></div></figure><figure class="ol lw ox on oo op oq paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><picture></picture></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn oy do oz ou">Figure 9. 
Listing Ranking Logic for Homepage and Location Category Experience</figcaption></figure></div><h1 id="09e3" class="mh mi jg bm mg mj mk ml mm mn mo mp mq km mr kn ms kp mt kq mu ks mv kt mw mx ga">Putting it all together</h1><p id="bedd" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga">To summarize, we presented how we create categories from scratch, first using rules that rely on listing signals and POIs and then using ML with humans in the loop to continually improve the categories. Figure 10 describes the end-to-end flow as it exists today.</p><figure class="ls lt lu lv gx lw gl gm paragraph-image"><div role="button" tabindex="0" class="lx ly do lz ce ma"><div class="gl gm pa"><picture></picture></div></div><figcaption class="md bl gn gl gm me mf bm b bn bo cn">Figure 10. Logic for Category Creation and Improvement over time</figcaption></figure><p id="0220" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga"><em class="oi">Our approach was to </em><strong class="kx jh"><em class="oi">define </em></strong><em class="oi">an acceptable delivery; </em><strong class="kx jh"><em class="oi">prototype </em></strong><em class="oi">several categories to an acceptable level; </em><strong class="kx jh"><em class="oi">scale </em></strong><em class="oi">the rest of the categories to the same level;</em><strong class="kx jh"><em class="oi"> revisit </em></strong><em class="oi">the acceptable delivery and improve the product over time.</em></p><p id="f387" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">In Part II, we’ll explain in greater detail the models that assign listings to categories.</p><h1 id="bfc5" class="mh mi jg bm mg mj mk ml mm mn mo mp mq km mr kn ms kp mt kq mu ks mv kt mw mx ga">Acknowledgments</h1><p id="e7d4" class="pw-post-body-paragraph kv kw jg kx b ky my kh la lb 
mz kk ld le na lg lh li nb lk ll lm nc lo lp lq iz ga"><em class="oi">We would like to thank everyone involved in the project. Building Airbnb Categories holds a special place in our careers as one of those rare projects where people with different backgrounds and roles came together to work jointly to build something unique.</em></p><p id="e40d" class="pw-post-body-paragraph kv kw jg kx b ky kz kh la lb lc kk ld le lf lg lh li lj lk ll lm ln lo lp lq iz ga">Interested in working at Airbnb? Check out our open roles <a class="au nf" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/building-airbnb-categories-with-ml-and-human-in-the-loop-e97988e70ebb</link>
      <guid>https://medium.com/airbnb-engineering/building-airbnb-categories-with-ml-and-human-in-the-loop-e97988e70ebb</guid>
      <pubDate>Mon, 21 Nov 2022 17:54:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Mussel — Airbnb’s Key-Value Store for Derived Data]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@shouyan.guo?source=post_page-----406b9fa1b296--------------------------------"><div class="l do"><img alt="Shouyan guo" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/1*f2pKwLa1FQY45cP3s3N5NQ.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@shouyan.guo?source=post_page-----406b9fa1b296--------------------------------">Shouyan guo</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Oct 10</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">9 min read</div></div></div></div><div class="o ao"><div class="h k hv hw hx"><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ib o ao"></div><div class="ck ih"></div></div></div><div class="ii ij ik j i d"><div class="fj l"><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="d387" class="pw-post-body-paragraph kd 
ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">How Airbnb built a persistent, high availability and low latency key-value storage engine for accessing derived data from offline and streaming events.</strong></p><p id="478a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">By:</strong> <a class="au lb" href="http://linkedin.com/in/chandramoulir" rel="noopener ugc nofollow" target="_blank">Chandramouli Rangarajan</a>, <a class="au lb" href="http://linkedin.com/in/shouyan-guo" rel="noopener ugc nofollow" target="_blank">Shouyan Guo</a>, <a class="au lb" href="http://linkedin.com/in/yuxijin" rel="noopener ugc nofollow" target="_blank">Yuxi Jin</a></p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm lc"><picture></picture></div></div></figure><h1 id="c844" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Introduction</h1><p id="c399" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Within Airbnb, many online services need access to derived data, which is data computed with large scale data processing engines like Spark or streaming events like Kafka and stored offline. These services require a high quality derived data storage system, with strong reliability, availability, scalability, and latency guarantees for serving online traffic. 
For example, the user profiler service stores and accesses real-time and historical user activities on Airbnb to deliver a more personalized experience.</p><p id="0b14" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In this post, we will talk about how we leveraged a number of open source technologies, including <a class="au lb" href="https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/HRegion.html" rel="noopener ugc nofollow" target="_blank">HRegion</a>, <a class="au lb" href="https://helix.apache.org/" rel="noopener ugc nofollow" target="_blank">Helix</a>, <a class="au lb" href="https://spark.apache.org/" rel="noopener ugc nofollow" target="_blank">Spark</a>, <a class="au lb" href="https://zookeeper.apache.org/" rel="noopener ugc nofollow" target="_blank">Zookeeper</a>, and <a class="au lb" href="https://kafka.apache.org/" rel="noopener ugc nofollow" target="_blank">Kafka</a>, to build a scalable, low-latency key-value store for hundreds of Airbnb product and platform use cases.</p><h1 id="b81a" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Derived Data at Airbnb</h1><p id="001b" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Over the past few years, Airbnb has evolved and enhanced our support for serving derived data, moving from teams rolling out custom solutions to a multi-tenant storage platform called Mussel. 
This evolution can be summarized into three stages:</p><p id="6cd8" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Stage 1 (01/2015): Unified read-only key-value store (HFileService)</strong></p><p id="8021" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Before 2015, there was no unified key-value store solution inside Airbnb that met four key requirements:</p><ol class=""><li id="5b1a" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">Scale to petabytes of data</li><li id="124d" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Efficient bulk load (batch generation and uploading)</li><li id="be2c" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Low latency reads (&lt;50ms p99)</li><li id="ec9e" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Multi-tenant storage service that can be used by multiple customers</li></ol><p id="af1f" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Also, none of the existing solutions were able to meet these requirements. 
<a class="au lb" href="https://www.mysql.com/" rel="noopener ugc nofollow" target="_blank">MySQL</a> didn’t support bulk loading, <a class="au lb" href="https://hbase.apache.org/" rel="noopener ugc nofollow" target="_blank">HBase</a>’s massive bulk loading (distcp) was neither optimal nor reliable, RocksDB had no built-in horizontal sharding, and we didn’t have enough C++ expertise to build a bulk load pipeline to support the RocksDB file format.</p><p id="ce91" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">So we built HFileService, which internally used <a class="au lb" href="http://devdoc.net/bigdata/hbase-0.98.7-hadoop1/book/hfilev2.html#:~:text=HFile%20is%20a%20low%2Dlevel,to%20write%20those%20inline%20blocks." rel="noopener ugc nofollow" target="_blank">HFile</a> (the building block of Hadoop HBase, which is based on Google’s SSTable):</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nf"><picture></picture></div></div><figcaption class="ng bl gn gl gm nh ni bm b bn bo cn"><em class="nj">Fig. 1: HFileService Architecture</em></figcaption></figure><ol class=""><li id="7b58" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">Servers were sharded and replicated to address scalability and reliability issues</li><li id="9a07" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">The number of shards was fixed (equivalent to the number of Hadoop reducers in the bulk load jobs) and the mapping of servers to shards was stored in Zookeeper. We configured the number of servers mapped to a specific shard by manually changing the mapping in Zookeeper</li><li id="b7f0" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">A daily Hadoop job transformed offline data to HFile format and uploaded it to S3. 
Each server downloaded the data for its own partitions to local disk and removed the old versions of the data</li><li id="ff47" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Different data sources were partitioned by primary key. Clients determined the correct shard for a request by hashing the primary key and taking the result modulo the total number of shards. They then queried Zookeeper to get the list of servers that held that shard and sent the request to one of them</li></ol><p id="72e1" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Stage 2 (10/2015): Store both real-time and derived data (Nebula)</strong></p><p id="aef3" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">While the multi-tenant key-value store we built supported efficient bulk loads and low-latency reads, it had its drawbacks. For example, it didn’t support low-latency point writes, and any update to the stored data had to go through the daily bulk load job. As Airbnb grew, there was an increased need for low-latency access to real-time data.</p><p id="9fee" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Therefore, Nebula was built to support both batch-update and real-time data in a single system. It internally used DynamoDB to store real-time data and S3/HFile to store batch-update data. Nebula introduced timestamp-based versioning as a version control mechanism. 
For read requests, data would be read from both a list of dynamic tables and the static snapshot in HFileService, and the results were merged based on timestamp.</p><p id="50c7" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To minimize online merge operations, Nebula also had scheduled Spark jobs that ran daily and merged snapshots of DynamoDB data with the static snapshot of HFileService. Zookeeper was used to coordinate write availability of dynamic tables, snapshots being marked ready for read, and dropping of stale tables.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nk"><picture></picture></div></div><figcaption class="ng bl gn gl gm nh ni bm b bn bo cn">Fig. 2: Nebula Architecture</figcaption></figure><p id="564c" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Stage 3 (2018): Scalable and low latency key-value storage engine (Mussel)</strong></p><p id="0032" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In Stage 2, we built a system that supported both reads and writes on real-time and batch-update data with timestamp-based conflict resolution. 
However, there were opportunities for improvement:</p><ol class=""><li id="aae4" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">Scale-out challenge: It was cumbersome to manually edit partition mappings inside Zookeeper as data grew, or to horizontally scale the system for increasing traffic by adding nodes</li><li id="c079" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Read performance under spiky write traffic needed improvement</li><li id="b2e7" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">High maintenance overhead: We needed to maintain HFileService and DynamoDB at the same time</li><li id="f86e" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Inefficient merging process: The daily process of merging the delta update from DynamoDB with the HFileService snapshot became very slow as our total data size grew. The daily update data in DynamoDB was just 1–2% of the baseline data in HFileService. 
However, we re-published the full snapshot (102% of total data size) back to HFileService daily</li></ol><p id="f4a1" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To solve the drawbacks, we came up with a new key-value store system called <strong class="kf jh">Mussel</strong>.</p><ol class=""><li id="87ad" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">We introduced Helix to manage the partition mapping within the cluster</li><li id="fff8" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">We leveraged Kafka as a replication log to replicate the write to all of the replicas instead of writing directly to the Mussel store</li><li id="525d" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">We used HRegion as the only storage engine in the Mussel storage nodes</li><li id="7b97" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">We built a Spark pipeline to load the data from the data warehouse into storage nodes directly</li></ol><p id="675e" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Let’s go into more details in the following paragraphs.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nl"><picture></picture></div></div><figcaption class="ng bl gn gl gm nh ni bm b bn bo cn">Fig. 3: Mussel Architecture</figcaption></figure><p id="dde9" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Manage partitions with Helix</strong></p><p id="5824" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In Mussel, in order to make our cluster more scalable, we increased the number of shards from 8 in HFileService to 1024. 
Data is partitioned into those shards by the hash of the primary key, so we introduced Apache Helix to manage the many logical shards. Helix manages the mapping of logical shards to physical storage nodes automatically. Each Mussel storage node can hold multiple logical shards. Each logical shard is replicated across multiple Mussel storage nodes.</p><p id="08b1" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Leaderless Replication with Kafka</strong></p><p id="ded9" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Since Mussel is a read-heavy store, we adopted a leaderless architecture. Read requests can be served by any Mussel storage node that holds the requested logical shard, which increases read scalability. In the write path, we needed to consider the following:</p><ol class=""><li id="d984" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">We wanted to smooth the write traffic to avoid impacting the read path</li><li id="ff02" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Since there is no leader node in each shard, we needed a way to make sure each Mussel storage node applies the write requests in the same order so the data is consistent across different nodes</li></ol><p id="fea7" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To solve these problems, we introduced Kafka as a write-ahead log. A write request is not written directly to a Mussel storage node; it is first written to Kafka asynchronously. We have 1024 partitions for the Kafka topic, each partition corresponding to one logical shard in Mussel. Each Mussel storage node polls the events from Kafka and applies the changes to its local store. 
Because every node consumes writes from the same Kafka partition, writes are applied in the same order within a partition even though there is no leader-follower relationship, ensuring consistent updates. The drawback here is that it can only provide eventual consistency. However, given the derived data use case, it is an acceptable tradeoff to compromise on consistency in the interest of ensuring availability and partition tolerance.</p><p id="331d" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Supporting read, write, and compaction in one storage engine</strong></p><p id="6328" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In order to reduce the hardware cost and operational load of managing DynamoDB, we decided to remove it and extend HFileService as the only storage engine to serve both real-time and offline data. To better support both read and write operations, we used <a class="au lb" href="https://hbase.apache.org/1.1/apidocs/org/apache/hadoop/hbase/regionserver/HRegion.html" rel="noopener ugc nofollow" target="_blank">HRegion</a> instead of <a class="au lb" href="https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/HFile.html" rel="noopener ugc nofollow" target="_blank">HFile</a>. HRegion is a fully functional key-value store with MemStore and BlockCache. Internally it uses a Log-Structured Merge (LSM) tree to store the data and supports both read and write operations.</p><p id="5035" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">An HRegion table contains column families, which are the logical and physical grouping of columns. Inside a column family are column qualifiers, which are the columns. Column families contain columns with timestamped versions. Columns only exist when they are inserted, which makes HRegion a sparse database. 
We mapped our client data to HRegion as follows:</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nm"><picture></picture></div></div></figure><p id="478e" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">With this mapping, for read queries, we’re able to support:</p><ol class=""><li id="8e12" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">Point queries, by looking up the data with the primary key</li><li id="2132" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Prefix/range queries, by scanning data on the secondary key</li><li id="fb3e" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Queries for the latest data or data within a specific time range, as both real-time and offline data written to Mussel will have a timestamp</li></ol><p id="113b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Because we have over 4000 client tables in Mussel, each user table is mapped to a column family in HRegion instead of its own table to reduce scalability challenges at the metadata management layer. Also, as HRegion is a column-based storage engine, each column family is stored in a separate file so they can be read/written independently.</p><p id="3522" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">For write requests, each Mussel storage node consumes the write request from Kafka and calls the HRegion put API to write the data directly. Each table can also customize its max version and TTL (time-to-live).</p><p id="79fc" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">When we serve write requests with HRegion, another thing to consider is compaction.
Compaction needs to run in order to clean up data that has been deleted or has reached its max version or max TTL. In addition, when the MemStore in HRegion reaches a certain size, it is flushed to disk as a StoreFile, and compaction merges those files to reduce disk seeks and improve read performance. However, while compaction is running, it causes higher CPU and memory usage and blocks writes to prevent JVM (Java Virtual Machine) heap exhaustion, which impacts the read and write performance of the cluster.</p><p id="1628" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Here we use Helix to divide the Mussel storage nodes for each logical shard into two types of resources: online nodes and batch nodes. For example, if we have 9 Mussel storage nodes for one logical shard, 6 of them are online nodes and 3 of them are batch nodes. The relationship between online and batch nodes is as follows:</p><ol class=""><li id="70fe" class="mr ms jg kf b kg kh kk kl ko mt ks mu kw mv la mw mx my mz ga">They both serve write requests</li><li id="cab9" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Only online nodes serve read requests, and we rate limit compaction on online nodes to maintain good read performance</li><li id="3bdc" class="mr ms jg kf b kg na kk nb ko nc ks nd kw ne la mw mx my mz ga">Helix schedules a daily rotation between online nodes and batch nodes.
In the example above, it moves 3 online nodes to batch and 3 batch nodes to online so those 3 new batch nodes can perform full-speed major compaction to clean up old data</li></ol><p id="d03a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">With this change, we’re now able to support both read and write with a single storage engine.</p><p id="9e25" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><strong class="kf jh">Supporting bulk load from data warehouse</strong></p><p id="7d4d" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We support two types of bulk load pipelines from the data warehouse to Mussel via <a class="au lb" href="https://airflow.apache.org/" rel="noopener ugc nofollow" target="_blank">Airflow</a> jobs: merge type and replace type. Merge type means merging the data from the data warehouse with the data from previous writes with older timestamps in Mussel. Replace type means importing the data from the data warehouse and deleting all the data with previous timestamps.</p><p id="d82a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We utilize Spark to transform data from the data warehouse into HFile format and upload it to S3. Each Mussel storage node downloads the files and uses the HRegion bulkLoadHFiles API to load those HFiles into the column family.</p><p id="bf6e" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">With this bulk load pipeline, we can just load the delta data into the cluster instead of the full data snapshot every day. Before the migration, the user profile service needed to load about 4TB of data into the cluster daily.
After, it only needs to load about 40–80GB, drastically reducing the cost and improving the performance of the cluster.</p><h1 id="48a2" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Conclusion and Next Steps</h1><p id="b011" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">In the last few years, Airbnb has come a long way in providing a high-quality derived data store for our engineers. The most recent key-value store Mussel is widely used within Airbnb and has become a foundational building block for any key-value based application with strong reliability, availability, scalability, and performance guarantees. Since its introduction, there have been ~4000 tables created in Mussel, storing ~130TB data in our production clusters without replication. Mussel has been working reliably to serve large amounts of read, write, and bulk load requests: For example, mussel-general, our largest cluster, has achieved &gt;99.9% availability, average read QPS &gt; 800k and write QPS &gt; 35k, with average P95 read latency less than 8ms.</p><p id="ed09" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Even though Mussel can serve our current use cases well, there are still many opportunities to improve. For example, we’re looking forward to providing the read-after-write consistency to our customers. We also want to enable auto-scale and repartition based on the traffic in the cluster. 
We’re looking forward to sharing more details about this soon.</p><h1 id="9529" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Acknowledgments</h1><p id="5279" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Mussel is a collaborative effort of Airbnb’s storage team including: <a class="au lb" href="http://linkedin.com/in/calvinzou" rel="noopener ugc nofollow" target="_blank">Calvin Zou</a>, <a class="au lb" href="https://medium.com/airbnb-engineering/linkedin.com/in/dionitas" rel="noopener ugc nofollow" target="_blank">Dionitas Santos</a>, <a class="au lb" href="http://linkedin.com/in/ruan-maia-367281161" rel="noopener ugc nofollow" target="_blank">Ruan Maia</a>, <a class="au lb" href="http://linkedin.com/in/wonheec" rel="noopener ugc nofollow" target="_blank">Wonhee Cho</a>, <a class="au lb" href="https://medium.com/airbnb-engineering/linkedin.com/in/xiaomou-wang-5880b537" rel="noopener ugc nofollow" target="_blank">Xiaomou Wang</a>, <a class="au lb" href="https://medium.com/airbnb-engineering/linkedin.com/in/yanhan-zhang-724088a4" rel="noopener ugc nofollow" target="_blank">Yanhan Zhang</a>.</p><p id="5780" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Interested in working on the Airbnb Storage team? Check out this role: <a class="au lb" href="https://careers.airbnb.com/positions/3029584/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Distributed Storage</a></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/mussel-airbnbs-key-value-store-for-derived-data-406b9fa1b296</link>
      <guid>https://medium.com/airbnb-engineering/mussel-airbnbs-key-value-store-for-derived-data-406b9fa1b296</guid>
      <pubDate>Mon, 10 Oct 2022 19:40:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Beyond A/B test: Speeding up Airbnb Search Ranking Experimentation through Interleaving]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@zq.zhangqing?source=post_page-----7087afa09c8e--------------------------------"><div class="l do"><img alt="Qing Zhang" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/1*SxmGNG1DvuCHJVkSiSjIfg.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@zq.zhangqing?source=post_page-----7087afa09c8e--------------------------------">Qing Zhang</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Oct 6</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">10 min read</div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="ed5c" class="pw-post-body-paragraph kd 
ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">An introduction to Airbnb’s interleaving experimentation framework, its usage, and our approaches to the challenges of our unique business</p><p id="8214" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga"><a class="au lb" rel="noopener" href="https://medium.com/@zq.zhangqing">Qing Zhang</a>, <a class="au lb" rel="noopener" href="https://medium.com/@michelle.du">Michelle Du</a>, Reid Andersen, <a class="au lb" rel="noopener" href="https://medium.com/@liweihe">Liwei He</a></p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm lc"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg" width="700" height="467" srcset="https://miro.medium.com/max/640/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 640w, https://miro.medium.com/max/720/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 720w, https://miro.medium.com/max/750/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 750w, https://miro.medium.com/max/786/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 786w, https://miro.medium.com/max/828/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 828w, https://miro.medium.com/max/1100/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 1100w, https://miro.medium.com/max/1400/1*4v8bM6rq3FsK7Zwa7UO7bA.jpeg 1400w" role="presentation" /></div></div></figure><h1 id="2d00" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Introduction</h1><p id="5795" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">When a user searches for a place to stay on Airbnb, we aim to show them the best results possible. Airbnb’s relevance team actively works on improving the search ranking experience and helping users find and book listings that match their preferences. A/B testing is our standard approach for online assessment.
Our business metrics are conversion-focused, and the frequency of guest travel transactions is lower than on other e-commerce platforms. These factors result in insufficient experiment bandwidth given the number of ideas that we want to test, so there is considerable demand for a more efficient online testing approach.</p><p id="e468" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Interleaving is an online ranking assessment approach [1–3]. In A/B tests, users are split into control and treatment groups, and the users in each group are consistently exposed to results from the corresponding ranker. Interleaving, on the other hand, blends the search results from both control and treatment and presents the “interleaved” results to the user (Figure 1). This mechanism enables a direct comparison between the two rankers by the same user, so the impact of the treatment ranker can be evaluated with a collection of specifically designed metrics.</p><p id="8df8" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">There are several challenges in building the framework on both the engineering and data science fronts. On the engineering side, we needed to extend our current A/B test framework to enable interleaving setup while adding minimal overhead for the ML engineers. Additionally, our search infrastructure is designed for single-request search and required significant extension to support interleaving functionality. On the data science side, we designed user event attribution logic that’s key to the effectiveness of the metrics.</p><p id="e011" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In 2021, we built the interleaving experimentation framework and integrated it into our experiment process, reaching a 50x improvement in sensitivity in the development of our search ranking algorithm.
Further validation confirms high agreement with A/B tests. We have been using interleaving for a wide range of tasks such as ranker assessment, hyperparameter tuning, and evaluating infra-level changes. The system design and learnings detailed in this blog post should benefit readers looking to improve their experimentation agility.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm mr"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*UUyBfFnZWWa13Mxk" width="700" height="259" srcset="https://miro.medium.com/max/640/0*UUyBfFnZWWa13Mxk 640w, https://miro.medium.com/max/720/0*UUyBfFnZWWa13Mxk 720w, https://miro.medium.com/max/750/0*UUyBfFnZWWa13Mxk 750w, https://miro.medium.com/max/786/0*UUyBfFnZWWa13Mxk 786w, https://miro.medium.com/max/828/0*UUyBfFnZWWa13Mxk 828w, https://miro.medium.com/max/1100/0*UUyBfFnZWWa13Mxk 1100w, https://miro.medium.com/max/1400/0*UUyBfFnZWWa13Mxk 1400w" role="presentation" /></div></div></figure><p id="cb26" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Figure 1: An illustration of A/B testing vs. interleaving. In traditional A/B tests, users are split into two groups and exposed to two different rankers. In interleaving, each user is presented with the blended results from two rankers.</p><h1 id="76b6" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Search Ranking Experimentation Procedure</h1><p id="d904" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">With interleaving, Airbnb search ranking experimentation uses a three-phase procedure for faster experimentation (Figure 2). First, we run standard offline evaluation on the ranker with NDCG (normalized discounted cumulative gain).
Rankers with reasonable results move on to online evaluation with interleaving. Those with promising interleaving results proceed to an A/B test.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm ms"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*LDlMakrih7JAiFkx" width="700" height="122" srcset="https://miro.medium.com/max/640/0*LDlMakrih7JAiFkx 640w, https://miro.medium.com/max/720/0*LDlMakrih7JAiFkx 720w, https://miro.medium.com/max/750/0*LDlMakrih7JAiFkx 750w, https://miro.medium.com/max/786/0*LDlMakrih7JAiFkx 786w, https://miro.medium.com/max/828/0*LDlMakrih7JAiFkx 828w, https://miro.medium.com/max/1100/0*LDlMakrih7JAiFkx 1100w, https://miro.medium.com/max/1400/0*LDlMakrih7JAiFkx 1400w" role="presentation" /></div></div></figure><p id="7e3c" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Figure 2: Ranking experimentation procedure. We use interleaving to get preliminary online results in order to enable fast iteration.</p><p id="664e" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Currently, we split our search traffic into two portions, using the vast majority for regular A/B tests and the remainder for interleaving experiments. We divide the interleaving traffic into buckets (called interleaving lanes), and each lane is used for one interleaving experiment. Each interleaving experiment takes up about 6% of the traffic of a regular A/B test and one-third of its running length. We achieve a 50x speedup over an A/B test given the same amount of traffic.
The team now has the luxury of testing multiple variations of an idea in a short time frame and identifying the promising routes to move forward.</p><h1 id="b623" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Airbnb Interleaving Framework</h1><p id="1f00" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">The interleaving framework controls the experimentation traffic and generates interleaved results to return to the user, as illustrated in Figure 3. Specifically, for users who are subject to interleaving, the system creates parallel search requests that correspond to the control and treatment rankers and produces a response for each. The results generation component blends the two responses with the team drafting algorithm, returns the final response to the user, and records logs. A suite of metrics was designed to measure impact.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm mt"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*xrxh0CYHcHielAP2" width="700" height="214" srcset="https://miro.medium.com/max/640/0*xrxh0CYHcHielAP2 640w, https://miro.medium.com/max/720/0*xrxh0CYHcHielAP2 720w, https://miro.medium.com/max/750/0*xrxh0CYHcHielAP2 750w, https://miro.medium.com/max/786/0*xrxh0CYHcHielAP2 786w, https://miro.medium.com/max/828/0*xrxh0CYHcHielAP2 828w, https://miro.medium.com/max/1100/0*xrxh0CYHcHielAP2 1100w, https://miro.medium.com/max/1400/0*xrxh0CYHcHielAP2 1400w" role="presentation" /></div></div></figure><p id="2ec9" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Figure 3: Interleaving system overview.
The interleaving framework controls the experimentation traffic and generates interleaved results to return to the user</p><h1 id="2e0f" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Team Drafting and Competitive Pairs</h1><p id="c000" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">The framework employs the <em class="mu">team drafting algorithm</em> to “interleave” the results from control and treatment (we call them teams). For the purpose of generalizability, we demonstrate the drafting process with two teams A and B. The steps of the algorithm are as follows:</p><p id="f101" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">1. Flip a coin to determine whether team A goes first.</p><p id="8866" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">2. Start with an empty merged list. Repeat the following steps until the desired size is reached:</p><p id="78d2" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">2.1 From each of the two rankers A and B, take the highest-ranked result that has not yet been selected (say listing a from ranker A and e from ranker B).</p><p id="c923" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">2.2 If the two listings are different, select both listings a and e, with a assigned to A and e assigned to B. We call (a, e) a <em class="mu">competitive pair</em>. Add the pair to the merged list in the order decided in Step 1.</p><p id="a8e0" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">2.3 If the two listings are the same, select that listing and do not assign it to either team.
Figure 4 demonstrates the process.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm mv"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*pqLeLNEtmZEQ2D9o" width="700" height="287" srcset="https://miro.medium.com/max/640/0*pqLeLNEtmZEQ2D9o 640w, https://miro.medium.com/max/720/0*pqLeLNEtmZEQ2D9o 720w, https://miro.medium.com/max/750/0*pqLeLNEtmZEQ2D9o 750w, https://miro.medium.com/max/786/0*pqLeLNEtmZEQ2D9o 786w, https://miro.medium.com/max/828/0*pqLeLNEtmZEQ2D9o 828w, https://miro.medium.com/max/1100/0*pqLeLNEtmZEQ2D9o 1100w, https://miro.medium.com/max/1400/0*pqLeLNEtmZEQ2D9o 1400w" role="presentation" /></div></div></figure><p id="39f4" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Figure 4: Team drafting example with competitive pair explained. Here we assume that team A goes first based on coin flip.</p><p id="6cce" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The<em class="mu"> team drafting algorithm</em> enables us to measure user preference in a fair way. For each request we flip a coin to decide which team (control or treatment) has the priority in the ordering of a <em class="mu">competitive pair</em>. 
This means that position bias is minimized, because within a competitive pair each team’s listing is ranked above the other team’s half of the time.</p><p id="0a3d" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Creating <em class="mu">competitive pairs</em> makes <a class="au lb" href="https://alexdeng.github.io/causal/sensitivity.html#vrreg" rel="noopener ugc nofollow" target="_blank">variance reduction</a> (a procedure to speed up experimentation by increasing the precision of the point estimates) more intuitive, since it deduplicates items with the same rank and only assigns scores to the impressions of competitive pairs instead of to each impression. In the example in Figure 4, the comparison between ranker A and ranker B reduces to a referendum on whether <em class="mu">a</em> is better than <em class="mu">e</em>. Leaving the other results unassigned improves the sensitivity in this case. In an extreme case where two rankers produce lists with exactly the same order, traditional interleaving would still attribute clicks to teams and add noise to the result, whereas with competitive pairs the entire search query can be ignored since the preference is exactly zero. This lets us focus on the real differences between rankers, which improves sensitivity.</p><p id="8ae3" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Furthermore, competitive pairs enable us to allocate credit to various downstream user activities much more easily. Unlike traditional interleaving, which mostly assigns credit for clicks [3–5], we assign credit for bookings, which are a downstream activity. This flexibility in credit association has empowered us to design complicated metrics without having to rely on click signals. For example, we are able to define metrics that measure the booking wins over competition with certain types of listings (e.g.
new listings) in the pairs. This enables us to further understand whether changes to the ranking of a specific category of listings contributed to the overall interleaving result.</p><p id="b4cb" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">To determine a winning ranker in our interleaving approach, we compare the <em class="mu">preference margin</em> (margin of victory for the winning team) on target events and apply a one-sample t-test to it to obtain the p-value. Validation studies confirmed that our framework produces results that are both reliable and robust, with a consistently low false positive rate and minimal carryover effect between experiments.</p><h1 id="c1d6" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Attribution</h1><p id="1026" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga"><em class="mu">Attribution logic</em> is a key component of our measurement framework. As mentioned earlier, a scenario that is more specific to Airbnb than to cases like web search or streaming sites is that our guests can issue multiple search requests before booking, and the listing they book may have been viewed or clicked multiple times while owned by different interleaving teams. This differs from use cases where the primary goal is click-based conversion.</p><p id="d099" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Let’s use a toy example to demonstrate the concept. As shown in Figure 5, the guest clicked the booked listing 3 times throughout the search journey, and the listing belonged to team A for 2 of those clicks and to team B for 1.
For this single guest alone, we see how the different attribution methods can lead to different conclusions:</p><ul class=""><li id="8dc3" class="mw mx jg kf b kg kh kk kl ko my ks mz kw na la nb nc nd ne ga">If we attribute the booking to the team that owned the listing when it was first clicked, we should assign it to team B and declare team B the winner for this guest;</li><li id="8363" class="mw mx jg kf b kg nf kk ng ko nh ks ni kw nj la nb nc nd ne ga">If we attribute the booking to the team that owned the listing when it was last clicked, we should assign it to team A and declare team A the winner for this guest;</li><li id="d283" class="mw mx jg kf b kg nf kk ng ko nh ks ni kw nj la nb nc nd ne ga">If we attribute the booking every time the listing was clicked, we should assign it twice to team A and once to team B, and end up declaring team A the winner for this guest.</li></ul><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nk"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*7WbbbuEdUHlirDp3" width="700" height="336" srcset="https://miro.medium.com/max/640/0*7WbbbuEdUHlirDp3 640w, https://miro.medium.com/max/720/0*7WbbbuEdUHlirDp3 720w, https://miro.medium.com/max/750/0*7WbbbuEdUHlirDp3 750w, https://miro.medium.com/max/786/0*7WbbbuEdUHlirDp3 786w, https://miro.medium.com/max/828/0*7WbbbuEdUHlirDp3 828w, https://miro.medium.com/max/1100/0*7WbbbuEdUHlirDp3 1100w, https://miro.medium.com/max/1400/0*7WbbbuEdUHlirDp3 1400w" role="presentation" /></div></div></figure><p id="355b" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Figure 5: A simplified example of a guest journey.
The guest issues multiple searches and views the booked listing multiple times before finally making a booking.</p><p id="5512" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We created multiple attribution logic variations and evaluated them on a wide collection of interleaving experiments that also had A/B runs as “ground truth”. We chose as our primary metric the one with the best alignment between interleaving and A/B tests.</p><h1 id="d3da" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Alignment with A/B tests</h1><p id="c0f4" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">To further evaluate the consistency between interleaving and A/B tests, we tracked eligible interleaving and A/B pairs and confirmed that the two are consistent with each other 82% of the time (Figure 6). The experiments are also highly sensitive, as noted in previous work from other companies like Netflix. To provide a concrete example, we have a ranker that randomly picks a listing in the top 300 results and inserts it into the top slot.
Interleaving needs only 0.5% of the A/B running time and 4% of the A/B traffic to reach the same conclusion as the corresponding A/B test.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div class="gl gm nl"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1066/0*TeUKjwEQznkN9wUW" width="533" height="432" srcset="https://miro.medium.com/max/640/0*TeUKjwEQznkN9wUW 640w, https://miro.medium.com/max/720/0*TeUKjwEQznkN9wUW 720w, https://miro.medium.com/max/750/0*TeUKjwEQznkN9wUW 750w, https://miro.medium.com/max/786/0*TeUKjwEQznkN9wUW 786w, https://miro.medium.com/max/828/0*TeUKjwEQznkN9wUW 828w, https://miro.medium.com/max/1100/0*TeUKjwEQznkN9wUW 1100w, https://miro.medium.com/max/1066/0*TeUKjwEQznkN9wUW 1066w" role="presentation" /></div></figure><p id="9490" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Figure 6: Interleaving and A/B consistency. We tracked eligible interleaving and A/B pairs and the results demonstrate that the two are consistent with each other 82% of the time</p><p id="5930" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">In most cases where interleaving turned out to be inconsistent with traditional A/B testing, we found that the reason was set-level optimization. For example, one ranker relies on a model to determine how strongly it will demote listings with high host rejection probability, and that model estimates the booking probability given the current page. Interleaving changes the page by blending in results from another ranker, which breaks this assumption and leads to inaccurate results.
Based on our learnings, we advise that rankers that involve set-level optimization should use interleaving on a case-by-case basis.</p><h1 id="0c8c" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Conclusion</h1><p id="6e22" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Search ranking quality is key for an Airbnb user to find their desired accommodation, and iterating on the algorithm efficiently is our top priority. The interleaving experimentation framework tackles our problem of limited A/B test bandwidth and provides up to a 50x speedup in search ranking algorithm iteration. We conducted comprehensive validation, which demonstrated that interleaving is highly robust and correlates strongly with traditional A/B testing. Interleaving is currently part of our experimentation procedure, and is the main evaluation technique before the A/B test. The framework opens a new field of online experimentation for the company and can be applied to other product surfaces such as recommendations.</p><p id="c5da" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Interested in working at Airbnb? 
Check out our open roles <a class="au lb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>.</p><h1 id="2a24" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Acknowledgments</h1><p id="3b78" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">We would like to thank Aaron Yin for guidance on the implementation of algorithms and metrics, Xin Liu for continuously advising us on optimizing and extending the framework to support more use cases, Chunhow Tan for valuable suggestions on improving the computational efficiency of interleaving metrics, and Tatiana Xifara for advice on experiment delivery design.</p><p id="550c" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The system would not have been possible without the support of our search backend team, especially Yangbo Zhu, Eric Wu, Varun Sharma and Soumyadip (Soumo) Banerjee. We benefited tremendously from their design advice and close collaboration on operations.</p><p id="d688" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We would also like to thank Alex Deng, Huiji Gao and Sanjeev Katariya for valuable feedback on the interleaving framework and this article.</p><h1 id="3329" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">References</h1><p id="3819" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">[1] JOACHIMS, T. Optimizing Search Engines Using Clickthrough Data. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (KDD). ACM, New York, NY, 132–142. 2002.</p><p id="cb0a" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">[2] JOACHIMS, T. 
Evaluating Retrieval Performance using Clickthrough Data. In Text Mining, J. Franke, G. Nakhaeizadeh, and I. Renz, Eds., Physica/Springer Verlag, 79–96. 2003.</p><p id="1dea" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">[3] RADLINSKI, F., KURUP, M., AND JOACHIMS, T. How does clickthrough data reflect retrieval quality. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM’08). ACM, New York, NY, 43–52. 2008.</p><p id="7da3" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">[4] Radlinski, Filip, and Nick Craswell. “Optimized interleaving for online retrieval evaluation.” Proceedings of the sixth ACM international conference on Web search and data mining. 2013.</p><p id="d376" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">[5] Hofmann, Katja, Shimon Whiteson, and Maarten De Rijke. “A probabilistic method for inferring preferences from clicks.” Proceedings of the 20th ACM international conference on Information and knowledge management. 2011.</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/beyond-a-b-test-speeding-up-airbnb-search-ranking-experimentation-through-interleaving-7087afa09c8e</link>
      <guid>https://medium.com/airbnb-engineering/beyond-a-b-test-speeding-up-airbnb-search-ranking-experimentation-through-interleaving-7087afa09c8e</guid>
      <pubDate>Thu, 06 Oct 2022 17:52:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Upgrading Data Warehouse Infrastructure at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header go gp gq gr gs gt gu gv gw gx l"><div class="o gy u"><div class="o"><div class="fj l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@ronnie.zhu?source=post_page-----a4e18f09b6d5--------------------------------"><div class="l do"><img alt="Ronnie Zhu" class="l ch fl gz ha fp" src="https://miro.medium.com/fit/c/96/96/0*-2nAGNWAU7HX_k9-" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn ga"><div class="hb o hc"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@ronnie.zhu?source=post_page-----a4e18f09b6d5--------------------------------">Ronnie Zhu</a></div></div><div class="hd he hf hg hh d"></div></div><div class="o ao ht"><p class="pw-published-date bm b bn bo cn">Sep 26</p><div class="hu ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">10 min read</div></div></div></div><div class="o ao"><div class="h k hv hw hx"><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="hy l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ib o ao"></div><div class="ck ih"></div></div></div><div class="ii ij ik j i d"><div class="fj l"><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="ip l fr"><div><div class="ci" aria-hidden="false"></div></div><div class="l fr"><div><div class="ci" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="iz ja jb jc jd"><div class=""><p id="f157" class="pw-post-body-paragraph kd ke jg kf b kg 
kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">This blog aims to introduce Airbnb’s experience upgrading Data Warehouse infrastructure to Spark and Iceberg.</p><p id="56fc" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">By: <a class="au lb" href="https://www.linkedin.com/in/huirong-ronnie-zhu-97b0a980/" rel="noopener ugc nofollow" target="_blank">Ronnie Zhu</a>, <a class="au lb" href="https://www.linkedin.com/in/edgarrd/" rel="noopener ugc nofollow" target="_blank">Edgar Rodriguez</a>, <a class="au lb" href="https://www.linkedin.com/in/qiang-jason-xu-7101b025/" rel="noopener ugc nofollow" target="_blank">Jason Xu</a>, <a class="au lb" href="https://www.linkedin.com/in/gustavo-torres-torres/" rel="noopener ugc nofollow" target="_blank">Gustavo Torres</a>, <a class="au lb" href="https://www.linkedin.com/in/kerimoktay" rel="noopener ugc nofollow" target="_blank">Kerim Oktay</a>, <a class="au lb" href="https://www.linkedin.com/in/zhangxu325/" rel="noopener ugc nofollow" target="_blank">Xu Zhang</a></p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm lc"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg" width="700" height="468" srcset="https://miro.medium.com/max/640/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 640w, https://miro.medium.com/max/720/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 720w, https://miro.medium.com/max/750/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 750w, https://miro.medium.com/max/786/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 786w, https://miro.medium.com/max/828/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 828w, https://miro.medium.com/max/1100/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 1100w, https://miro.medium.com/max/1400/1*Ky-obyCnt-A0R4qjnoY1_g.jpeg 1400w" role="presentation" /></div></div></figure><h1 id="7abc" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml 
ga">Introduction</h1><p id="0829" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">In this blog post, we will introduce our motivations for upgrading our Data Warehouse infrastructure to Spark 3 and Iceberg. We will briefly describe the current state of Airbnb’s data warehouse infrastructure and its challenges. We will then share our learnings from upgrading one critical production workload: event data ingestion. Finally, we will share the results and the lessons learned.</p><h1 id="7ca3" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Context</h1><p id="ad7a" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Airbnb’s Data Warehouse (DW) storage was previously migrated from legacy <a class="au lb" rel="noopener" href="https://medium.com/airbnb-engineering/data-infrastructure-at-airbnb-8adfb34f169c">HDFS clusters</a> to S3 to provide better stability and scalability. While our team has continued to improve the reliability and stability of the workloads that operate on data in S3, certain characteristics of these workloads and the infrastructure they depend on introduce scalability and productivity limitations that our users encounter on a regular basis.</p><h2 id="1cb4" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Challenges</h2><h2 id="b5fe" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Hive Metastore</h2><p id="46e8" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">With an increasing number of partitions, the load on Hive’s backend DBMS has become a bottleneck, as has the load from partition operations (e.g., querying thousands of partitions for a month’s worth of data). 
As a workaround, we usually add a stage of daily aggregation and keep two tables for queries of different time granularities (e.g., hourly and daily). To save on storage, we limit intraday Hive tables to short retention (three days), and keep daily tables for longer retention (several years).</p><h2 id="604f" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Hive/S3 Interactions</h2><p id="c3ab" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Hive was not originally designed for object storage. Instead, many assumptions were made around HDFS when implementing features such as renames and file listings. Migrating from HDFS to S3 therefore required certain guarantees to ensure that datasets were consistent for list-after-write operations. We customized the way Hive writes to S3, first writing to a temporary HDFS cluster and then moving the data to S3 via an optimized distcp process that writes to unique locations during the commit phase, storing file-listing information in a separate store for fast access. This process has performed well over the past two years, but it requires additional cluster resources to run.</p><h2 id="848f" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Schema Evolution</h2><p id="0e36" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">At Airbnb, we use three compute engines to access data in our Data Warehouse: Spark, Trino and Hive. 
Since each compute engine handles schema changes differently, changes to table schemas have almost always resulted in data quality issues or required engineers to perform costly rewrites.</p><h2 id="6fd0" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Partitioning</h2><p id="1b27" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Hive tables are partitioned by fixed columns, and partition columns cannot be easily changed. Repartitioning a dataset requires creating a new table and reloading the entire dataset.</p><h1 id="9517" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">New Data Stack</h1><p id="74dc" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">These challenges have motivated us to upgrade our Data Warehouse infrastructure to a new stack based on Iceberg and Spark 3, which addresses these problems and also provides usability improvements.</p><h2 id="094b" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Iceberg</h2><p id="edb1" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga"><a class="au lb" href="https://iceberg.apache.org/docs/latest/" rel="noopener ugc nofollow" target="_blank">Apache Iceberg</a> is a table format designed to address several of the shortcomings of traditional file system-based Data Warehousing storage formats such as Hive. Iceberg is designed to deliver high-performance reads for huge analytics tables, with features such as serializable isolation, snapshot-based time travel, and predictable schema evolution. 
Some important Iceberg features that help with the challenges mentioned earlier:</p><ul class=""><li id="fe83" class="nf ng jg kf b kg kh kk kl ko nh ks ni kw nj la nk nl nm nn ga">Partition information is not stored in the Hive metastore, which removes a large source of load on the metastore.</li><li id="e4cd" class="nf ng jg kf b kg no kk np ko nq ks nr kw ns la nk nl nm nn ga">Iceberg tables do not require S3 listings. This removes the list-after-write consistency requirement, which in turn can eliminate the need for the extra distcp job, and it entirely avoids the latency of the list operation.</li><li id="ed07" class="nf ng jg kf b kg no kk np ko nq ks nr kw ns la nk nl nm nn ga">A consistent table schema is defined in the <a class="au lb" href="https://iceberg.apache.org/spec/#schema-evolution" rel="noopener ugc nofollow" target="_blank">Iceberg spec</a>, which guarantees consistent behavior across compute engines and avoids unexpected behavior when changing columns.</li></ul><h2 id="46dd" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Spark 3</h2><p id="5114" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga"><a class="au lb" href="https://spark.apache.org/" rel="noopener ugc nofollow" target="_blank">Apache Spark</a> has become the de facto standard for big data processing in the past 10 years. Spark 3 is a new major version released in 2020; it comes with a long list of changes, including new functionality, bug fixes, and performance improvements. 
We focus on introducing Adaptive Query Execution (AQE) here; you can find more info on the <a class="au lb" href="https://databricks.com/blog/2020/06/18/introducing-apache-spark-3-0-now-available-in-databricks-runtime-7-0.html" rel="noopener ugc nofollow" target="_blank">Databricks blog</a>.</p><p id="80ba" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">AQE is a query optimization technique that uses runtime statistics to optimize the Spark query execution plan. This solves one of the greatest struggles of Spark cost-based optimization — inaccurate statistics collected before query starts often lead to suboptimal query plans. AQE will figure out data characteristics and improve query plans as the query runs, increasing query performance.</p><p id="cc84" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Spark 3 is also a prerequisite for Iceberg adoption. Iceberg table write and read support using Spark SQL is only available on Spark 3.</p><p id="2298" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">The diagram below shows the change we made:</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nt"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*cIYLTqi2XeFm3vxb" width="700" height="341" srcset="https://miro.medium.com/max/640/0*cIYLTqi2XeFm3vxb 640w, https://miro.medium.com/max/720/0*cIYLTqi2XeFm3vxb 720w, https://miro.medium.com/max/750/0*cIYLTqi2XeFm3vxb 750w, https://miro.medium.com/max/786/0*cIYLTqi2XeFm3vxb 786w, https://miro.medium.com/max/828/0*cIYLTqi2XeFm3vxb 828w, https://miro.medium.com/max/1100/0*cIYLTqi2XeFm3vxb 1100w, https://miro.medium.com/max/1400/0*cIYLTqi2XeFm3vxb 1400w" role="presentation" /></div></div><figcaption class="nu bl gn gl gm nv nw bm b 
bn bo cn"><strong class="bm lq"><em class="nx">Figure 1.</em></strong><em class="nx"> Evolution of data compute and storage tech stack</em></figcaption></figure><h1 id="c179" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Production Case Study — Data Ingestion</h1><p id="da99" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">At Airbnb, the Hive-based data ingestion framework processes more than 35 billion Kafka event messages and 1,000+ tables per day, and lands datasets ranging from kilobytes to terabytes into hourly and daily partitions. The volume, the range of dataset sizes, and the time granularity requirements make this framework a good candidate to benefit from our Spark+Iceberg tech stack.</p><h1 id="7ba9" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Spark 3</h1><p id="9b68" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">The first step in migrating to the aforementioned Spark+Iceberg compute tech stack was to move our Hive queries to Spark. This introduced a new challenge: Spark tuning. Unlike Hive, which relies on data volume stats, Spark uses preset shuffle partition values to determine task split sizes. Thus, choosing the proper number of shuffle partitions became a big challenge in tuning the event data ingestion framework on Spark. Data volume varies a lot across events, and the data size of a single event also changes over time. 
Figure 2 shows the high variance of shuffle data size of Spark jobs processing a sampling of 100 different types of events.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div class="gl gm ny"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1200/0*qAIItnnv9VtcG9nz" width="600" height="371" srcset="https://miro.medium.com/max/640/0*qAIItnnv9VtcG9nz 640w, https://miro.medium.com/max/720/0*qAIItnnv9VtcG9nz 720w, https://miro.medium.com/max/750/0*qAIItnnv9VtcG9nz 750w, https://miro.medium.com/max/786/0*qAIItnnv9VtcG9nz 786w, https://miro.medium.com/max/828/0*qAIItnnv9VtcG9nz 828w, https://miro.medium.com/max/1100/0*qAIItnnv9VtcG9nz 1100w, https://miro.medium.com/max/1200/0*qAIItnnv9VtcG9nz 1200w" role="presentation" /></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Figure 2.</strong> High variance of raw data size of 100 randomly sampled events; each bar represents a single dataset</figcaption></figure><p id="8127" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">There isn’t a fixed number of shuffle partitions that would work well for all events in the ingestion framework; if we pick a fixed number for all ingestion jobs, it might be too big for some jobs but too small for others, and both would result in low performance. While we were exploring different solutions to tune shuffle partition parameters, we found that Adaptive Query Execution could be a perfect solution.</p><h2 id="f740" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">How does AQE help?</h2><p id="f486" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">In Spark 3.0, the AQE framework ships with several key features, including dynamically switching join strategies and dynamically optimizing skew joins. 
However, the most critical new feature for our use case is dynamically coalescing shuffle partitions, which ensures that each Spark task operates on roughly the same amount of data. It does this by combining adjacent small partitions into bigger partitions at runtime. Since shuffle data can dynamically grow or shrink between different stages of a job, AQE is continually re-optimizing the size of each partition through coalescing throughout a job’s lifetime. This brought a great performance boost.</p><p id="3670" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">AQE handles all cases in our data ingestion framework well, including edge cases of spiky events and new events. One note is that flattening of nested columns and compression of file storage format (in our case, Parquet GZIP) might generate fairly small output files for small task splits. To ensure output file sizes are large enough to be efficiently accessed, we can increase the AQE advisory shuffle partition size accordingly.</p><h2 id="0aab" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">AQE Tuning Experience</h2><p id="c29c" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Let’s walk through an example to get a better understanding of AQE and its tuning experience. Say we run the example query to load one dataset. The query has one Map stage to flatten events and another Reduce stage to handle deduplication. 
After adopting AQE and running the job in Spark, we can see two highlighted steps get added to the physical plan.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm nz"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/1*XZWObyyGiQUBqgKNP7HSQw.png" width="700" height="211" srcset="https://miro.medium.com/max/640/1*XZWObyyGiQUBqgKNP7HSQw.png 640w, https://miro.medium.com/max/720/1*XZWObyyGiQUBqgKNP7HSQw.png 720w, https://miro.medium.com/max/750/1*XZWObyyGiQUBqgKNP7HSQw.png 750w, https://miro.medium.com/max/786/1*XZWObyyGiQUBqgKNP7HSQw.png 786w, https://miro.medium.com/max/828/1*XZWObyyGiQUBqgKNP7HSQw.png 828w, https://miro.medium.com/max/1100/1*XZWObyyGiQUBqgKNP7HSQw.png 1100w, https://miro.medium.com/max/1400/1*XZWObyyGiQUBqgKNP7HSQw.png 1400w" role="presentation" /></div></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Figure 3.</strong> Change of physical plan of the example Spark job</figcaption></figure><p id="f575" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Now let’s take a closer look at our tuning phase. As shown in Table 1, we went through several iterations of param setting. From our experience, if the actual shuffle partition used is equal to the initial partition number we set, we should increase the initial partition number to split initial tasks more and get them coalesced. And if the average output file size is too small, we can increase the advisory partition size to generate larger shuffle partitions, and thus larger output files. 
After inspecting the shuffle data of each task, we could also decrease executor memory and the max number of executors.</p><p id="4a9d" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">We also experimented with the tuned job parameters on datasets of different sizes, as shown in Tables 2 and 3. From the results, we can see that once tuned, AQE performs well on datasets ranging from zero bytes to terabytes in size, all while using a single set of job parameters.¹</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm oa"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*eZu1Q9D0lyD5ZEVE" width="700" height="237" srcset="https://miro.medium.com/max/640/0*eZu1Q9D0lyD5ZEVE 640w, https://miro.medium.com/max/720/0*eZu1Q9D0lyD5ZEVE 720w, https://miro.medium.com/max/750/0*eZu1Q9D0lyD5ZEVE 750w, https://miro.medium.com/max/786/0*eZu1Q9D0lyD5ZEVE 786w, https://miro.medium.com/max/828/0*eZu1Q9D0lyD5ZEVE 828w, https://miro.medium.com/max/1100/0*eZu1Q9D0lyD5ZEVE 1100w, https://miro.medium.com/max/1400/0*eZu1Q9D0lyD5ZEVE 1400w" role="presentation" /></div></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Table 1.</strong> Tuning AQE using an example medium-size dataset</figcaption></figure><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm ob"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*K6Ff5uiN90I_ci_o" width="700" height="215" srcset="https://miro.medium.com/max/640/0*K6Ff5uiN90I_ci_o 640w, https://miro.medium.com/max/720/0*K6Ff5uiN90I_ci_o 720w, https://miro.medium.com/max/750/0*K6Ff5uiN90I_ci_o 750w, https://miro.medium.com/max/786/0*K6Ff5uiN90I_ci_o 786w, https://miro.medium.com/max/828/0*K6Ff5uiN90I_ci_o 828w, https://miro.medium.com/max/1100/0*K6Ff5uiN90I_ci_o 1100w, 
https://miro.medium.com/max/1400/0*K6Ff5uiN90I_ci_o 1400w" role="presentation" /></div></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Table 2.</strong> Job stats of an example small-size dataset</figcaption></figure><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm ob"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*-whu7KM8AK8w1N-f" width="700" height="215" srcset="https://miro.medium.com/max/640/0*-whu7KM8AK8w1N-f 640w, https://miro.medium.com/max/720/0*-whu7KM8AK8w1N-f 720w, https://miro.medium.com/max/750/0*-whu7KM8AK8w1N-f 750w, https://miro.medium.com/max/786/0*-whu7KM8AK8w1N-f 786w, https://miro.medium.com/max/828/0*-whu7KM8AK8w1N-f 828w, https://miro.medium.com/max/1100/0*-whu7KM8AK8w1N-f 1100w, https://miro.medium.com/max/1400/0*-whu7KM8AK8w1N-f 1400w" role="presentation" /></div></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Table 3.</strong> Job stats of an example empty dataset</figcaption></figure><p id="66ec" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">From our results, it’s clear that AQE can bring the shuffle split size very close to our predefined value in the Reduce stage and thus generate output files of the target size, as we expect. Furthermore, since each shuffle split is close to the predefined value, we can also lower executor memory from its default value to ensure efficient resource allocation. 
As an additional big advantage of the framework, we do not need any special handling to onboard new datasets.</p><h1 id="f547" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Iceberg — Partition specs &amp; Compaction</h1><h2 id="3c6a" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">How does Iceberg help?</h2><p id="4952" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">In our data ingestion framework, we found that we could take advantage of Iceberg’s flexibility to define multiple partition specs to consolidate ingested data over time. Each data file written in a partitioned Iceberg table belongs to exactly one partition, but we can control the granularity of the partition values over time. Ingested tables write new data with an hourly granularity (ds/hr), and a daily automated process compacts the files into a daily partition (ds) without losing the hourly granularity, which can later be applied to queries as a residual filter.</p><p id="a424" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Our compaction process is smart enough to determine whether a data rewrite is required to reach an optimal file size; if not, it just rewrites the metadata to assign the existing data files to the daily partition. This has simplified the process for ingesting event data and provides a consolidated view of the data to the user within the same table. As an added benefit, we’ve realized cost savings in the overall process with this approach.</p><p id="38f6" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">As shown in the diagram below, in the consolidated Iceberg table we switch the partition spec from ds/hr to ds at the end of the day. 
In addition, user queries are now easier to write and can access fresher data with full history. Keeping only one copy of data also helps improve both compute and storage efficiencies and ensures data consistency.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div role="button" tabindex="0" class="li lj do lk ce ll"><div class="gl gm oc"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1400/0*Mrwb_94NNQWrs_mF" width="700" height="386" srcset="https://miro.medium.com/max/640/0*Mrwb_94NNQWrs_mF 640w, https://miro.medium.com/max/720/0*Mrwb_94NNQWrs_mF 720w, https://miro.medium.com/max/750/0*Mrwb_94NNQWrs_mF 750w, https://miro.medium.com/max/786/0*Mrwb_94NNQWrs_mF 786w, https://miro.medium.com/max/828/0*Mrwb_94NNQWrs_mF 828w, https://miro.medium.com/max/1100/0*Mrwb_94NNQWrs_mF 1100w, https://miro.medium.com/max/1400/0*Mrwb_94NNQWrs_mF 1400w" role="presentation" /></div></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Figure 4.</strong> Change of table storage format for table consolidation</figcaption></figure><h2 id="6140" class="mr lp jg bm lq ms mt mu lu mv mw mx ly ko my mz mc ks na nb mg kw nc nd mk ne ga">Table Consolidation Experience</h2><p id="1161" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Consolidating hourly and daily data into one Iceberg table requires changes in both the write and read paths. For the write path, to mitigate the aforementioned issues caused by small files, we force a compaction run during the partition spec switch. Tables 4 and 5 compare the statistics from our intelligent compaction jobs with the cost of a full rewrite of all the data files associated with the daily partition. 
For some large tables we obtain resource savings of &gt; 90% by leveraging Iceberg’s ability to avoid data copying during compaction.</p><figure class="ld le lf lg gx lh gl gm paragraph-image"><div class="gl gm od"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1298/0*yNW6DneHRxWZdlwq" width="649" height="291" srcset="https://miro.medium.com/max/640/0*yNW6DneHRxWZdlwq 640w, https://miro.medium.com/max/720/0*yNW6DneHRxWZdlwq 720w, https://miro.medium.com/max/750/0*yNW6DneHRxWZdlwq 750w, https://miro.medium.com/max/786/0*yNW6DneHRxWZdlwq 786w, https://miro.medium.com/max/828/0*yNW6DneHRxWZdlwq 828w, https://miro.medium.com/max/1100/0*yNW6DneHRxWZdlwq 1100w, https://miro.medium.com/max/1298/0*yNW6DneHRxWZdlwq 1298w" role="presentation" /></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Table 4.</strong> Compaction job comparison of example small-size dataset</figcaption></figure><figure class="ld le lf lg gx lh gl gm paragraph-image"><div class="gl gm oe"><img alt="" class="ce lm ln c" src="https://miro.medium.com/max/1344/0*R8NW6NHCpI2OUUxm" width="672" height="293" srcset="https://miro.medium.com/max/640/0*R8NW6NHCpI2OUUxm 640w, https://miro.medium.com/max/720/0*R8NW6NHCpI2OUUxm 720w, https://miro.medium.com/max/750/0*R8NW6NHCpI2OUUxm 750w, https://miro.medium.com/max/786/0*R8NW6NHCpI2OUUxm 786w, https://miro.medium.com/max/828/0*R8NW6NHCpI2OUUxm 828w, https://miro.medium.com/max/1100/0*R8NW6NHCpI2OUUxm 1100w, https://miro.medium.com/max/1344/0*R8NW6NHCpI2OUUxm 1344w" role="presentation" /></div><figcaption class="nu bl gn gl gm nv nw bm b bn bo cn"><strong class="bm lq">Table 5.</strong> Compaction job comparison of example large-size dataset</figcaption></figure><p id="cd06" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">For the read path, since most data consumers use Airflow’s partition sensors, we updated the implementation of partition sensing. 
Specifically, we implemented a signal system to sense empty partitions in Iceberg tables, as opposed to the prior method of looking up each Hive partition as an actual row in the Hive metastore.</p><h1 id="1e01" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Results</h1><p id="1408" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Compared with the prior TEZ and Hive stack, our data ingestion framework on Spark 3 and Iceberg saves more than 50% of compute resources and reduces job elapsed time by 40%. From a usability standpoint, we made it simpler and faster to consume stored data by leveraging Iceberg’s capabilities for native schema and partition evolution.</p><h1 id="dd1c" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Conclusion</h1><p id="f81c" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">In this post, we shared the upgrades we applied to Airbnb’s data compute and storage tech stack. 
We hope that readers enjoyed learning how our event data ingestion framework benefits from Adaptive Query Execution and Iceberg and that they consider applying similar tech stack changes to their use cases involving datasets of varying size and time granularity.</p><p id="7f68" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">If this type of work interests you, please check out our open roles <a class="au lb" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">here</a>!</p><h1 id="e7b4" class="lo lp jg bm lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml ga">Acknowledgments</h1><p id="8593" class="pw-post-body-paragraph kd ke jg kf b kg mm ki kj kk mn km kn ko mo kq kr ks mp ku kv kw mq ky kz la iz ga">Special thanks to Bruce Jin, Guang Yang, Adam Kocoloski and Jingwei Lu for their continued guidance and support!</p><p id="6843" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">Also countless thanks to Mark Giangreco, Surashree Kulkarni and Shylaja Ramachandra for providing edits and great suggestions to the post!</p></div><div class="o dx of og ii oh" role="separator"><div class="iz ja jb jc jd"><p id="fceb" class="pw-post-body-paragraph kd ke jg kf b kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la iz ga">[1] One callout is that Spark AQE has a bug handling empty input (<a class="au lb" href="https://issues.apache.org/jira/browse/SPARK-35239" rel="noopener ugc nofollow" target="_blank">SPARK-35239</a>), and fixes are available in 3.2. 
Thus to take full advantage of AQE in lower Spark versions, we need to backport <a class="au lb" href="https://github.com/apache/spark/pull/32362" rel="noopener ugc nofollow" target="_blank">fix 1</a> and <a class="au lb" href="https://github.com/apache/spark/pull/31994" rel="noopener ugc nofollow" target="_blank">fix 2</a>.</p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/upgrading-data-warehouse-infrastructure-at-airbnb-a4e18f09b6d5</link>
      <guid>https://medium.com/airbnb-engineering/upgrading-data-warehouse-infrastructure-at-airbnb-a4e18f09b6d5</guid>
      <pubDate>Mon, 26 Sep 2022 20:31:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[How Airbnb safeguards changes in production]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gi gj gk gl gm gn go gp gq gr l"><div class="o gs u"><div class="o"><div class="fd l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@bustamove89?source=post_page-----c83e94bfc52--------------------------------"><div class="l do"><img alt="Zack Loebel-Begelman" class="l ch ff gt gu fj" src="https://miro.medium.com/fit/c/48/48/1*lenZcXTeCkrc2yvhTJVycw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bm b dm dn fu"><div class="gv o gw"><div><div class="ci" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@bustamove89?source=post_page-----c83e94bfc52--------------------------------">Zack Loebel-Begelman</a></div></div><div class="gx gy gz ha hb d"></div></div><div class="o ao hn"><p class="pw-published-date bm b bn bo cn">Sep 6</p><div class="ho ci" aria-hidden="true">·</div><div class="pw-reading-time bm b bn bo cn">8 min read</div></div></div></div><div class="o ao"><div class="h k hp hq hr"><div class="hs l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="hs l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="hs l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="hv o ao"></div><div class="ck ib"></div></div></div><div class="ic id ie j i d"><div class="fd l"><div class="ij l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="ij l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="ij l fl"><div><div class="ci" aria-hidden="false"></div></div><div class="l fl"><div><div class="ci" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><h1 id="1bdb" class="jx jy ja bm 
jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Part II: Near Real-time Experiments</h1><p id="7d4c" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">By: <a class="au lt" href="https://www.linkedin.com/in/michaelcl/" rel="noopener ugc nofollow" target="_blank">Mike Lin</a>, <a class="au lt" href="https://www.linkedin.com/in/preetiramasamy/" rel="noopener ugc nofollow" target="_blank">Preeti Ramasamy</a>, <a class="au lt" href="https://www.linkedin.com/in/toby-mao/" rel="noopener ugc nofollow" target="_blank">Toby Mao</a>, <a class="au lt" href="https://www.linkedin.com/in/zack-loebel-begelman-85407698/" rel="noopener ugc nofollow" target="_blank">Zack Loebel-Begelman</a></p><figure class="lv lw lx ly gr lz gf gg paragraph-image"><div role="button" tabindex="0" class="ma mb do mc ce md"><div class="gf gg lu"><img alt="" class="ce me mf" src="https://miro.medium.com/max/1400/1*TwziuVGxkiaD4XKu4A2-pA.jpeg" srcset="https://miro.medium.com/max/300/1*TwziuVGxkiaD4XKu4A2-pA.jpeg 300w" role="presentation" /></div></div></figure><p id="122a" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">In our <a class="au lt" rel="noopener" href="https://medium.com/airbnb-engineering/how-airbnb-safeguards-changes-in-production-9fc9024f3446">first post</a> we discussed the need for a near real time Safe Deploy system and some of the statistics that power its decisions. In this post we will cover the architecture and engineering choices behind the various components that Safe Deploys comprises.</p><p id="72dc" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Designing a near real-time experimentation system required making explicit tradeoffs among speed, precision, cost, and resiliency. 
An early decision was to limit near real-time results to only the first 24 hours of an experiment, which is enough time to catch any major issues and transition to using comprehensive results from the batch pipeline. The idea was that once batch results were available, experimenters would no longer need real-time results. The following sections describe the additional design decisions in each component of the Safe Deploys system.</p><h1 id="baeb" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">High Level Design</h1><p id="6a47" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">There are 3 major components that make up the technical footprint of the Safe Deploys system:</p><ol class=""><li id="3345" class="ml mm ja kx b ky mg lc mh lg mn lk mo lo mp ls mq mr ms mt fu"><strong class="kx jb">Ramp Controller</strong>, a <a class="au lt" href="https://flink.apache.org/" rel="noopener ugc nofollow" target="_blank">Flink</a> job that acts as a centralized coordinator, providing experiment configuration to NRT via Kafka and invoking statistical computations by calling Measured via HTTP.</li><li id="0c08" class="ml mm ja kx b ky mu lc mv lg mw lk mx lo my ls mq mr ms mt fu"><strong class="kx jb">Near Real Time (NRT) pipeline</strong>, another Flink job that extracts measures, joins and enriches those measures with assignment information (treatment and subject information), and stores the enriched measures in S3.</li><li id="73f9" class="ml mm ja kx b ky mu lc mv lg mw lk mx lo my ls mq mr ms mt fu"><strong class="kx jb">Measured</strong>, a Python library (invoked via a Python HTTP server and worker pool) that consumes enriched measures from S3, aggregates them, and runs stats to determine whether any change is significant.</li></ol><figure class="lv lw lx ly gr lz gf gg paragraph-image"><div role="button" tabindex="0" class="ma mb do mc ce md"><div class="gf gg mz"><img alt="" 
class="ce me mf" src="https://miro.medium.com/max/1400/0*dO4dDyRoREQgfsh9" srcset="https://miro.medium.com/max/300/0*dO4dDyRoREQgfsh9 300w" role="presentation" /></div></div><figcaption class="na bl gh gf gg nb nc bm b bn bo cn">Fig 1: Architecture Diagram of the Safe Deploy system</figcaption></figure><h1 id="1586" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Ramp Controller</h1><p id="b939" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">The Ramp Controller performs automated experiment ramping based on the results from Measured. It increases experiment exposure in stages, slowly exposing more subjects and monitoring metric impacts at each stage. If any egregiously negative metric is observed, the Ramp Controller will immediately shut down the experiment to minimize the impacts of bad changes. It supports several ramping algorithms, but most users leverage a simple time based algorithm.</p><figure class="lv lw lx ly gr lz gf gg paragraph-image"><div role="button" tabindex="0" class="ma mb do mc ce md"><div class="gf gg nd"><img alt="" class="ce me mf" src="https://miro.medium.com/max/1400/0*Uu8Z0bAhhrDH9CSQ" srcset="https://miro.medium.com/max/300/0*Uu8Z0bAhhrDH9CSQ 300w" role="presentation" /></div></div><figcaption class="na bl gh gf gg nb nc bm b bn bo cn">Figure 2: Ramping process</figcaption></figure><p id="58e4" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Ramp Controller was designed to be stateless, and resilient to any job failures. Within seconds of an experiment starting, it publishes metadata to Kafka, triggering NRT to start joining events for that experiment. The metadata includes a path in S3 that NRT will write to. 
At this point the Ramp Controller’s core loop will begin:</p><figure class="lv lw lx ly gr lz gf gg paragraph-image"><div role="button" tabindex="0" class="ma mb do mc ce md"><div class="gf gg ne"><img alt="" class="ce me mf" src="https://miro.medium.com/max/1400/1*ypoetTDVqKYg0ID8_EpTNA.png" srcset="https://miro.medium.com/max/300/1*ypoetTDVqKYg0ID8_EpTNA.png 300w" role="presentation" /></div></div></figure><p id="c8cd" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Results are computed for the first 24 hours of the experiment, with new metrics consumed as new files are published to S3. A metric is marked egregious when the percent change is smaller than -20% with an adjusted p-value of less than or equal to 0.01. By leveraging the Measured framework for metric computation, we get custom aggregations, richer statistical models, and the ability to compute performance metrics, and dimensional cuts for metrics.</p><p id="ddc8" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">After overcoming these technical challenges in scaling the pipeline and tuning decision making, we were ready to vet the system with experiment owners and drive adoption.</p><h1 id="75e7" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Near Real Time (NRT) Pipeline</h1><p id="c268" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">We built the new NRT pipeline in Java and Scala using Apache Flink. 
It reads from several Kafka streams: event streams containing raw user-based events (impressions, booking requests, etc.), a stream that the Ramp Controller emits containing experiment metadata, and the stream that contains the raw assignment events emitted for all experiments, which are also consumed by the batch pipeline.</p><p id="f443" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Previously, Airbnb had attempted to build an online data store for all experiment assignments; however, this did not scale and was eventually shut down. By reducing the scope, specifically limiting the NRT pipeline to the first 24 hours of an experiment, we are able to store a bounded subset of assignments. Using a <a class="au lt" href="https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/broadcast_state/" rel="noopener ugc nofollow" target="_blank">broadcast join</a> of experiment metadata lets us filter the assignment events, and Flink makes <a class="au lt" href="https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#state-time-to-live-ttl" rel="noopener ugc nofollow" target="_blank">aging out data</a> trivial.</p><p id="edef" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">The extraction is written in a stand-alone library so that the measure definitions can be re-used in both batch and streaming. To stay performant, measure extraction first determines which events are relevant, using an inverted index based on the existence and values of JSON fields, and then runs extraction only on those events. We extract not only measures but also dimensions from each event. 
Because we want to limit the complexity in this job, we only support dimensions from the same event as the measure itself.</p><p id="fd86" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Our first difficulty was how to handle measures and assignments arriving out of order. When joining assignment events to measures, we want the two sides to age out at different times. Assignment data should be stored for the full 24 hours. The volume of measure events means we can’t keep them for 24 hours, so we keep a short buffer and drop measures after 5 minutes. The outer join needed to achieve this required building a custom join using the <a class="au lt" href="https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/co/KeyedCoProcessFunction.html" rel="noopener ugc nofollow" target="_blank">keyed co-process API</a>.</p><p id="b2cb" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Once the data is joined, we buffer it internally within Flink to reduce the total number of files for small experiments. We wrote a simple keyed process stage that hashes events on their timestamp against the number of concurrent files we want to output. It’s important that we hash on the timestamp, since Flink requires the keying mechanism to be deterministic. Events are buffered based on event count and time, and the buffered list is emitted once a partition hits either an event-count or a time-based threshold. 
This stage gives us finer-grained control over the number of files we output.</p><p id="dc83" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">We leverage Flink’s <a class="au lt" href="https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/filesystem/#parquet-format" rel="noopener ugc nofollow" target="_blank">built-in support for Parquet and S3 as a file sink</a> to write the files. In order to provide exactly-once semantics, Flink will only write files when checkpointing occurs. Files output by the NRT pipeline are consumed by the Ramp Controller to make decisions. To keep our latency low, we checkpoint every 5 minutes.</p><h1 id="fefe" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Measured</h1><p id="7ef9" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">Measured is a framework for defining and computing metrics. It consists of a Scala library for extracting measures and dimensions from raw events, which the NRT pipeline leverages, and a Python library for defining metrics (based on those measures), statistical models, and visualizations. This section focuses on the Python library and how it is used to compute metrics.</p><p id="dc36" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">In order to provide consistent results across platforms, we run the same Measured jobs that users run via a Python HTTP job server and worker pool. The NRT metric evaluation is one of those jobs; it downloads the event files from S3 using <a class="au lt" href="https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor" rel="noopener ugc nofollow" target="_blank">a Python worker pool</a>. 
Once the files are downloaded, the job leverages <a class="au lt" href="https://duckdb.org/" rel="noopener ugc nofollow" target="_blank">DuckDB</a>’s <a class="au lt" href="https://duckdb.org/docs/data/parquet" rel="noopener ugc nofollow" target="_blank">Parquet reader functionality</a> to aggregate to the user level. Once we have a local user aggregate, the job evaluates the various sequential models discussed in the first post. The results of these evaluations are stored in a MySQL database upon job completion, to be retrieved by the UI or the Ramp Controller over HTTP.</p><h1 id="590a" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Adoption</h1><p id="3ab7" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">The full vision of Safe Deploys encompassed safeguarding any changes in production. However, to gain experience and trust, we initially focused our efforts on A/B tests. We knew that Safe Deploys, like any anomaly detection system, especially one that automates remediation steps, would face certain challenges in adoption, including:</p><ul class=""><li id="713c" class="ml mm ja kx b ky mg lc mh lg mn lk mo lo mp ls nf mr ms mt fu">Trust in NRT metrics that were similar to, but not exactly the same as, existing batch ones</li><li id="6093" class="ml mm ja kx b ky mu lc mv lg mw lk mx lo my ls nf mr ms mt fu">Relinquishment of control in ramping and shutdown of experiments</li><li id="43ee" class="ml mm ja kx b ky mu lc mv lg mw lk mx lo my ls nf mr ms mt fu">False positives that could slow down experimentation by forcing restarts</li></ul><p id="11e8" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Before Safe Deploys, nearly a quarter of Airbnb teams had a manual process for ramping up experiments. 
This consisted of increasing the exposure of an experiment, manually verifying performance metrics, and repeating until reaching a target exposure. This manual approach often masked significant but not visually obvious negative impacts.</p><figure class="lv lw lx ly gr lz gf gg paragraph-image"><div role="button" tabindex="0" class="ma mb do mc ce md"><div class="gf gg ng"><img alt="" class="ce me mf" src="https://miro.medium.com/max/1400/0*B_kTvecFqvuTWE17" srcset="https://miro.medium.com/max/300/0*B_kTvecFqvuTWE17 300w" role="presentation" /></div></div><figcaption class="na bl gh gf gg nb nc bm b bn bo cn">Figure 3: Illustration of how a controlled experiment provides greater sensitivity</figcaption></figure><p id="33bf" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">We evangelized Safe Deploys as complementary to the existing process, providing increased sensitivity in detecting negative impacts, while still allowing experimenters to stop an experiment based on their own monitoring at any time. We also continually improved the statistical methods used, to decrease false positives and negatives. Since we enabled Safe Deploys by default a year ago, it has been used for over 85% of experiment starts, helped prevent dozens of incidents, and flagged misconfigurations early, minimizing negative impacts to Airbnb’s business and engineering resources wasted on remediation.</p><h1 id="a4ec" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Tip of the Iceberg</h1><p id="02b8" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">Safeguarding experiments was a significant step towards reducing incidents at Airbnb; however, the full vision encompasses changes originating from other channels. 
The distribution of changes across different channels can be found in Figure 4.</p><figure class="lv lw lx ly gr lz gf gg paragraph-image"><div role="button" tabindex="0" class="ma mb do mc ce md"><div class="gf gg nh"><img alt="" class="ce me mf" src="https://miro.medium.com/max/1400/0*7Il9QJd_KJZcnBHp" srcset="https://miro.medium.com/max/300/0*7Il9QJd_KJZcnBHp 300w" role="presentation" /></div></div><figcaption class="na bl gh gf gg nb nc bm b bn bo cn">Figure 4: Distribution of changes pushed to production by channel</figcaption></figure><p id="0fba" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">We tackled each remaining channel differently:</p><ul class=""><li id="0ab8" class="ml mm ja kx b ky mg lc mh lg mn lk mo lo mp ls nf mr ms mt fu">Feature Flags were unified with Experiments to gain Safe Deploys capabilities</li><li id="dc21" class="ml mm ja kx b ky mu lc mv lg mw lk mx lo my ls nf mr ms mt fu">Content management systems were provided APIs to programmatically create experiments tied to content changes, and ramped with Safe Deploys</li><li id="6a8f" class="ml mm ja kx b ky mu lc mv lg mw lk mx lo my ls nf mr ms mt fu">Code Deploys through Spinnaker ran Safe Deploys alongside pre-existing Automated Canary Analysis with each deploy</li></ul><p id="7972" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">(We considered infrastructure configs out of scope, since these lower level changes would require a fundamentally different approach to address.)</p><p id="b275" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Code Deploys with Spinnaker account for an outsized majority of changes in production and required extensive work to enable. 
The next post in this series will cover how we achieved this through changes across traffic routing, service configs, and dynamically created configurations for Spinnaker.</p><p id="7945" class="pw-post-body-paragraph kv kw ja kx b ky mg la lb lc mh le lf lg mi li lj lk mj lm ln lo mk lq lr ls it fu">Interested in working at Airbnb? Check out our <a class="au lt" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">open roles</a>.</p><h1 id="4773" class="jx jy ja bm jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku fu">Acknowledgements</h1><p id="4173" class="pw-post-body-paragraph kv kw ja kx b ky kz la lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls it fu">Safe Deploys was only made possible through the combined efforts across Airbnb’s infrastructure and data science teams. We would like to thank <a class="au lt" href="https://www.linkedin.com/in/jingwei-lu-5701222/" rel="noopener ugc nofollow" target="_blank">Jingwei Lu</a>, <a class="au lt" href="https://www.linkedin.com/in/wei-hou-93a069a4/" rel="noopener ugc nofollow" target="_blank">Wei Ho</a>, and the rest of the Stream Infrastructure team for helping implement, and subsequently scale the NRT pipeline. Also, thanks to Candace Zhang, <a class="au lt" href="https://www.linkedin.com/in/erikriverson/" rel="noopener ugc nofollow" target="_blank">Erik Iverson</a>, <a class="au lt" href="https://www.linkedin.com/in/minyong-lee-1a302466/" rel="noopener ugc nofollow" target="_blank">Minyong Lee</a>, Reid Andersen, <a class="au lt" href="https://www.linkedin.com/in/shant-torosean-606aa354/" rel="noopener ugc nofollow" target="_blank">Shant Toronsean</a>, <a class="au lt" href="https://www.linkedin.com/in/tatiana-xifara/" rel="noopener ugc nofollow" target="_blank">Tatiana Xifara</a>, and the many data scientists that helped build out metrics and verify their correctness. 
Also thanks to <a class="au lt" href="https://www.linkedin.com/in/kodnous/" rel="noopener ugc nofollow" target="_blank">Kate Odnus</a>, <a class="au lt" href="https://www.linkedin.com/in/kedar-bellare-3048128a/" rel="noopener ugc nofollow" target="_blank">Kedar Bellare</a> and <a class="au lt" href="https://www.linkedin.com/in/pmaccart/" rel="noopener ugc nofollow" target="_blank">Phil MacCart</a>, who were early adopters and provided us invaluable feedback. In addition, we thank <a class="au lt" href="https://www.linkedin.com/in/kocolosk/" rel="noopener ugc nofollow" target="_blank">Adam Kocoloski</a>, <a class="au lt" href="https://www.linkedin.com/in/rstata/" rel="noopener ugc nofollow" target="_blank">Raymie Stata</a> and <a class="au lt" href="https://www.linkedin.com/in/ronnyk/" rel="noopener ugc nofollow" target="_blank">Ronny Kohavi</a> for championing the effort across the company. We would also like to thank other members of the ERF team that contributed to Safe Deploys: <a class="au lt" href="https://www.linkedin.com/in/adriankuhn/" rel="noopener ugc nofollow" target="_blank">Adrian Kuhn</a>, <a class="au lt" href="https://www.linkedin.com/in/antoinecreux/" rel="noopener ugc nofollow" target="_blank">Antoine Creux</a>, <a class="au lt" href="https://www.linkedin.com/in/george-l-9b946655/" rel="noopener ugc nofollow" target="_blank">George Li</a>, <a class="au lt" href="https://www.linkedin.com/in/krishna-bhupatiraju-1ba1a524/" rel="noopener ugc nofollow" target="_blank">Krishna Bhupatiraju</a>, <a class="au lt" href="https://www.linkedin.com/in/shao-xie-0b84b64/" rel="noopener ugc nofollow" target="_blank">Shao Xie</a>, and <a class="au lt" href="https://www.linkedin.com/in/vincent-chan-70080423/" rel="noopener ugc nofollow" target="_blank">Vincent Chan</a>.</p><blockquote class="ni nj nk"><p id="1ffd" class="kv kw nl kx b ky mg la lb lc mh le lf nm mi li lj nn mj lm ln no mk lq lr ls it fu">All product names, logos, and brands are property of their respective 
owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</p></blockquote></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/how-airbnb-safeguards-changes-in-production-c83e94bfc52</link>
      <guid>https://medium.com/airbnb-engineering/how-airbnb-safeguards-changes-in-production-c83e94bfc52</guid>
      <pubDate>Tue, 06 Sep 2022 19:16:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Veerabahu Chandran]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fk l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@lauren.mackevich?source=post_page-----70468aa3bc06--------------------------------"><div class="l dp"><img alt="Lauren Mackevich" class="l ci fm ha hb fq" src="https://miro.medium.com/fit/c/96/96/0*-imhApAGWwgM89i1.jpg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b dn do gb"><div class="hc o hd"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@lauren.mackevich?source=post_page-----70468aa3bc06--------------------------------">Lauren Mackevich</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Aug 18</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">5 min read</div></div></div></div><div class="o ao"><div class="h k hw hx hy"><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="ic o ao"></div><div class="cl ii"></div></div></div><div class="ij ik il j i d"><div class="fk l"><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ja jb jc jd je"><div class=""><p id="8af2" 
class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Learning and growing in Airbnb’s new Bangalore Tech Center</p><figure class="ld le lf lg gy lh gm gn paragraph-image"><div role="button" tabindex="0" class="li lj dp lk cf ll"><div class="gm gn lc"><img alt="" class="cf lm ln" src="https://miro.medium.com/max/1400/1*wwf3CMkjhKPlaxichQJd1g.jpeg" width="700" height="467" role="presentation" /></div></div></figure><p id="5b46" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><em class="lo">Veera Chandran is an engineer in Airbnb’s new Bangalore Tech Center, where his team builds out technical systems to support hosts. As a lifelong learner, he has a passion for exploring new technologies and diving into practical problems. He’s excited to be tackling both the technical challenges of building new architecture and the organizational challenges of building out the capabilities of a new office.</em></p><p id="420d" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><em class="lo">Here’s Veera’s story:</em></p><h1 id="711c" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Learning and exploring</h1><p id="4df4" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">I grew up in Tamil Nadu, in the South of India. I was always a curious kid, trying to understand how everything worked, so when it came to choosing a course of study, engineering was a natural fit. I feel lucky that I had a lot of education opportunities in front of me, and I was able to choose the path I wanted to take.</p><p id="4232" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">My first exposure to computers came when I was in 8th grade. 
These were still the relatively early days of the computer, and my dad brought one home so he could learn to use it. I found it fascinating and learned BASIC and Logo. These are simple languages, but I was excited that I could use them to make drawings and text appear on the screen. Those rudimentary programs opened me to the world of what’s possible with computer science.</p><h1 id="f1eb" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Studying and practical experience</h1><p id="5287" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">I went to the College of Engineering, Guindy to study Computer Science and Engineering. My studies covered a lot of subjects, but the one that excited me most was networking. I was really curious to understand how data moves from one place to another.</p><p id="4e4e" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">My first practical networking experience came while still in school. There were a bunch of gamers I knew, and they wanted to set up a LAN to play <em class="lo">Age of Empires</em> and <em class="lo">Quake </em>together. I got together with a couple of my friends and built out an inexpensive networking solution for them, covering everything from cabling to routers to setting up configurations. I’m actually not that big of a gamer myself, but building out a network was really exciting for me. 
I love to understand things on a practical level, because while theoretical understanding is important, I always find it most meaningful to see how things actually work.</p><h1 id="9445" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">The power of engineering</h1><p id="73f7" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">After graduation, I got a job in networking. There were several interesting companies in the space at the time, and I ended up joining one run by a group of IIT (Indian Institute of Technology) professors. It was a great opportunity to learn from some of the brightest minds in the field. I always tried to make sure I was learning, so I sought out whatever would help me continue to grow. Eventually, I moved on to opportunities at larger internet companies where I could use my networking knowledge but also expand into topics like large-scale, distributed systems.</p><p id="9791" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Being a software engineer has always been exciting to me because it gives you the power to solve so many problems. When my daughter was born, my wife and I were looking for names that had to fit multiple constraints–e.g. it had to start with a <em class="lo">specific </em>letter–and it was a struggle to come up with viable options. As an engineer, I realized there was an easier way to find all our choices. I wrote a quick program that downloaded a list of millions of names and then ran them through our criteria. From that, I was able to narrow it down to a list of a few thousand options. My wife was amazed that I could generate so many names with just a couple hours of work.</p><p id="7d06" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">The challenges of engineering are also interesting. 
You have to work hard to keep yourself informed. The industry moves so fast. When I started, I was using Java 4, and now we’re on Java 18. The way you would solve a problem in each of these versions is so different. All these newer languages have also emerged, and you can apply each to different situations. It feels like every day new machine learning research pushes the boundaries in unimaginable ways. I don’t know what’s going to come next, but I know it’s going to evolve quickly.</p><h1 id="797a" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Finding impact at Airbnb</h1><p id="924d" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">After a while in my previous role, I began to feel like my learning curve was getting saturated, so I wanted to look for a new challenge somewhere I could have a larger impact. I heard Airbnb was opening a tech center in Bangalore, and I was excited by the opportunity to be one of the first engineers there.</p><p id="6141" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">I joined in September 2021 as the first engineer in the Hosting org in Bangalore. I focus on tools for compliance, which is a complex problem space. Every region has its own laws on short-term rentals that hosts have to follow, and the laws can vary at different levels: the US has its laws, and then California might customize some of them, and San Francisco might have its own on top of that, and so on. These laws can also change quickly, like they did during Covid, so our products need to be versatile and adapt to new conditions.</p><p id="16b2" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Airbnb has been a great place to work. 
It’s startup-y in that there are challenging technical problems to work on, but the job is stable and the company respects your work-life balance. As a technical leader, there’s a great opportunity to be part of the evolution of our technology roadmap. The architecture recently transitioned from a monolith to a Service Oriented Architecture, so we’re still figuring out the best approaches for the problems we’re solving.</p><p id="ce86" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">I’ve also appreciated Airbnb’s culture, especially the focus on inclusivity and belonging. My coworkers want to make everyone feel comfortable. When they introduced themselves to me, they all included their pronouns, making it easier for anyone else to share theirs. The people here live the culture and make everyone feel included.</p><h1 id="ff39" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Building our office in Bangalore</h1><p id="cb57" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">One of my favorite things about working as an engineer in the Bangalore office is the ownership and accountability. We’re not just a delivery center, where we’re being passed requirements from elsewhere and building that one piece of software. We like to call ourselves a capability center. Our team is tasked with the whole span of product development, from identifying what user problems exist all the way through to delivering a solution for them. We work on the same roadmap, codebase, and tech stack as Airbnb HQ.</p><p id="2d53" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Our team is growing quickly, both in Bangalore and remotely across India. With the team being spread out, trust and team-building have been important. 
We have a social meeting every Friday, and the whole team shows up so we can get to know one another. It’s great for connecting with teammates, and the trust we’re creating helps us build more successful products.</p><p id="43af" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Airbnb leadership has a clear roadmap for the future of the Bangalore Tech Center, and the team is growing quickly. It’s been exciting to build our first tech center outside of headquarters. We’re hiring for a number of teams and we’d love to hear from you!</p><p id="9800" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Check out these related roles based out of Bangalore:</p><p id="14ab" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au ms" href="https://grnh.se/777f0dbd1us" rel="noopener ugc nofollow" target="_blank">Engineering Manager, Ambassador Platform Products</a></p><p id="463e" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au ms" href="https://grnh.se/9b78e7f21us" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Ambassador Platforms</a></p><p id="2bac" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au ms" href="https://grnh.se/e0c9d3761us" rel="noopener ugc nofollow" target="_blank">Manager, BizTech</a></p><p id="7c18" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au ms" href="https://grnh.se/d43963981us" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Cities</a></p><p id="67ee" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au ms" 
href="https://grnh.se/6a500ddd1us" rel="noopener ugc nofollow" target="_blank">Sr. Analytics &amp; Insight Analyst: CSA</a></p><p id="ab36" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au ms" href="https://grnh.se/b6de7b661us" rel="noopener ugc nofollow" target="_blank">Operations Engineer, Biztech</a></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-veerabahu-chandran-70468aa3bc06</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-veerabahu-chandran-70468aa3bc06</guid>
      <pubDate>Thu, 18 Aug 2022 20:50:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Sisyphus and the CVE Feed: Vulnerability Management at Scale]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fk l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@keziahp?source=post_page-----e2749f86a7a4--------------------------------"><div class="l dp"><img alt="Keziah Plattner" class="l ci fm ha hb fq" src="https://miro.medium.com/fit/c/96/96/1*MP5Ehc3nxVUIPgjvMtkIIQ@2x.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b dn do gb"><div class="hc o hd"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@keziahp?source=post_page-----e2749f86a7a4--------------------------------">Keziah Plattner</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Aug 10</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">13 min read</div></div></div></div></div></div></header><section><div><div class="ja jb jc jd je"><div class=""><figure class="gq gs kf kg kh ki gm gn 
paragraph-image"><div role="button" tabindex="0" class="kj kk dp kl cf km"><div class="gm gn ke"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/1400/0*McONkEbdjlQOssDo" width="700" height="467" role="presentation" /></div></div></figure><p id="c881" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Authors:</strong> <a class="au ln" href="https://www.linkedin.com/in/keziahsonderplattner" rel="noopener ugc nofollow" target="_blank">Keziah Perez Sonder Plattner</a>, Senior Software Engineer; <a class="au ln" href="https://www.linkedin.com/in/kadia-m-58a18328" rel="noopener ugc nofollow" target="_blank">Kadia Mashal</a>, Engineering Manager</p><h1 id="4c3f" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Introduction</h1><p id="610a" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Every engineer knows that security is a never-ending problem. Until we delete all our code and move into a cottage in the woods, we have to accept that there is no such thing as 100% secure software. You could be doing everything perfectly, and a publicly known vulnerability (<a class="au ln" href="https://www.redhat.com/topics/security/what-is-cve" rel="noopener ugc nofollow" target="_blank">CVE</a>) could emerge for the most up-to-date version of a third-party library in your infrastructure. Things are secure until they are not. As with Sisyphus, the boulder will never reach the top of the hill.</p><p id="d459" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Rather than eliminating vulnerabilities, the goal of a vulnerability management program should be to quickly and effectively detect and respond to the barrage of threats that surface every day. There are many scanners and vendor tools that purport to solve the problem. 
But with the scanners comes the problem of a never-ending flood of CVE reports, thus slowing down our ability to remediate in a timely manner.</p><figure class="ms mt mu mv gy ki gm gn paragraph-image"><div class="gm gn mr"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/832/0*m80-F_w_BxVFZo0_" width="416" height="416" role="presentation" /></div></figure><h1 id="e164" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Vulnerability Management Lifecycle</h1><p id="016d" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">If you are new to vulnerability management, here are the basics of the lifecycle.</p><figure class="ms mt mu mv gy ki gm gn paragraph-image"><div role="button" tabindex="0" class="kj kk dp kl cf km"><div class="gm gn mw"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/1400/0*UxDN236N9MBuYvDh" width="700" height="440" role="presentation" /></div></div><figcaption class="mx bm go gm gn my mz bn b bo bp co"><em class="na">Fig. 
1: The Vulnerability Management Lifecycle</em></figcaption></figure><p id="f86e" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Detection</strong></p><p id="ad9d" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Find potential vulnerabilities in our infrastructure, anywhere from CVEs to insecure misconfigurations.</p><p id="6526" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Risk Assessment</strong></p><p id="c66a" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Apply a risk framework to the findings to identify true positives and weed out non-applicable vulnerabilities.</p><p id="c658" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Reporting</strong></p><p id="1917" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Find the team and/or person best suited to address it and track progress in a methodical way. 
In addition, centrally track all vulnerabilities in order to have a full view of our attack surface.</p><p id="d928" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Remediation and Prevention</strong></p><p id="cc0a" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Promptly remediate the vulnerability and invest in work to prevent the vulnerability from being introduced in the first place.</p><h1 id="1492" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Objectives</h1><p id="f514" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">In building a vulnerability management program, we want to focus on the following:</p><ul class=""><li id="bd3e" class="nb nc jh kr b ks kt kw kx la nd le ne li nf lm ng nh ni nj gb"><strong class="kr ji">Visualize Known Attack Surface</strong>: We can’t properly assess risk if we don’t know our vulnerability status in the first place.</li><li id="d208" class="nb nc jh kr b ks nk kw nl la nm le nn li no lm ng nh ni nj gb"><strong class="kr ji">Speed</strong>: Detection, reporting, and remediation should be completed in a timely manner.</li><li id="ad99" class="nb nc jh kr b ks nk kw nl la nm le nn li no lm ng nh ni nj gb"><strong class="kr ji">Prioritization</strong>: Focus on the highest-priority vulnerabilities before tackling less important or harder to exploit ones.</li><li id="6f06" class="nb nc jh kr b ks nk kw nl la nm le nn li no lm ng nh ni nj gb"><strong class="kr ji">Scaling</strong>: Support a constantly evolving and expanding cloud infrastructure.</li></ul><h1 id="cfd3" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Gaps and Challenges with Standard Industry Solutions</h1><p id="ace0" class="pw-post-body-paragraph kp kq jh kr b ks 
mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Standard industry advice leans on out-of-the-box vendor deployments, manual risk assessment, and operationally-heavy reporting processes. Automation, if it exists, relies on limited vendor functionality without the flexibility to adjust to unique attributes of our environment. This led to major challenges in accomplishing our objectives.</p><p id="ec77" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Lack of Vendor Agnostic Solutions</strong></p><p id="5633" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">As our infrastructure expands, the number of vulnerability types we want to track grows along with it. A variety of scanning solutions are needed to cover our bases, but they come with different setups and reporting processes. And there may be future scanning solutions that will work better for us, so we don’t want to lock ourselves into a single vendor or solution.</p><p id="0847" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Noisy and Inaccurate Severity Ratings</strong></p><p id="6a33" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">In our experience, the majority of scanners provide inaccurate risk scoring. Vulnerability bulletins and assessments like the basic Common Vulnerability Scoring System (CVSS) may describe a worst-case scenario that is difficult to exploit or sensationalize an issue, leading to inflated severity. 
And when a major zero-day vulnerability is found, it’s more likely to be identified by an anonymous Twitter user than a scanner.</p><p id="5d4e" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Additionally, internal mitigations can lower the impact of a vulnerability, and generic scanners rarely have ways to add internal context to customize risk ratings. Asset information and the location of the vulnerability play a massive role in determining its severity.</p><p id="3fc8" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Operational Work</strong></p><p id="1601" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Many vulnerability management solutions assume the need for human intervention in the process. However, humans make mistakes, and every manual step leads to slower remediation times. Spending time on onerous tasks like manually assessing risk severity or making tickets takes away from our time to focus on root-cause remediation.</p><h1 id="98d4" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Guiding Principles</h1><p id="6d9a" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Before developing our solution, we wanted to establish guiding principles. While there will always be exceptions, our goal is to keep this as our north star.</p><p id="3ca3" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Limit the need for human intervention by reducing false positives. 
</strong>It can be tempting to design a solution that catches close to 100% of true positives.</p><p id="272b" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">However, when maximizing true positives, it’s inevitable that the false positive rate will increase too. False positives create onerous manual work for both the security team and the owning engineering team. We don’t want to “cry wolf” on vulnerabilities that are not worth addressing. Doing so is unscalable and breaks trust in the security org. Relying on manual triage to verify alert severity also slows down response time, leaving vulnerabilities open for longer.</p><p id="a160" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">In addition, the idea that manually checking vulnerabilities will result in higher accuracy doesn’t take into account human error and alert fatigue. No human or automated process will ever achieve 100% true positive accuracy, and we must build that knowledge into the solution instead of fighting a losing battle against the barrage of noisy alerts.</p><p id="870a" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Pair detection with preventative measures.</strong> Now that we have established that some true positives will slip through the cracks, we need to address how to handle those cases. We pair our detection and reporting workflows with preventative solutions that address the root cause of the vulnerabilities. For example, if we make sure to have a regular patch cadence, then all vulnerabilities–including lower-priority ones–will be addressed within a reasonable timeframe. 
If we fix a flawed design, we will reduce the number of vulnerabilities in the first place.</p><p id="ec38" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Build relationships.</strong></p><p id="4e3a" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">All the automation in the world won’t be useful if we antagonize other engineering teams, rather than empowering and incentivizing them to remediate problems. Developer productivity matters — we don’t want to create an onerous system that frustrates developers and makes security the enemy. We want to avoid blocking solutions unless the priority calls for it, and we want to focus engineering efforts on our highest priority gaps rather than spreading engineers thin across many low and medium level vulnerabilities. Of course, there are always exceptions here–sometimes a vulnerability is severe enough to require being paged in the middle of the night. 
But we want to be sparing with that approach.</p><p id="a5db" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">It helps to get teams on board by pairing security fixes with other benefits, or working them into existing workflows so the process is essentially invisible to outside engineers.</p><p id="f3ab" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">We have tried top-down solutions in the past, but nothing has been as effective as treating engineering teams as our partners and seeking their input as stakeholders, making it a mutually beneficial process.</p><p id="eff3" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><strong class="kr ji">Maintain accountability.</strong> Detecting vulnerabilities doesn’t help if there is no accountability to fix them. We want to make sure they are correctly attributed to business units so that leadership is held responsible for fixing vulnerabilities. Keeping business units accountable means that they will be incentivized to allocate resources towards the problem instead of keeping security issues on the back burner.</p><p id="11db" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">There should be a company-wide Service-Level Agreement (SLA) based on the severity of the vulnerability that is part of the success metrics for each org. If an SLA cannot be met, we offer automated SLA extension requests to allow teams to make adjustments. 
We avoid exceptions except when absolutely necessary and periodically review the reasoning.</p><h1 id="9873" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Building an Automated, Vendor-Agnostic Vulnerability Management Pipeline</h1><p id="5666" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Given the challenges, the standard industry advice just didn’t work for our use case. So, we decided to create our own engineering solution.</p><figure class="ms mt mu mv gy ki gm gn paragraph-image"><div role="button" tabindex="0" class="kj kk dp kl cf km"><div class="gm gn mw"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/1400/0*G2KcgicUKe7FhslT" width="700" height="418" role="presentation" /></div></div><figcaption class="mx bm go gm gn my mz bn b bo bp co"><em class="na">Fig. 2: Automated Vulnerability Pipeline</em></figcaption></figure><h2 id="1175" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Step 1: Aggregate and Process Vulnerabilities</h2><p id="229a" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">First, we turned the barrage of alerts from our scanners into a standardized format, centralized in a single place, instead of having each scanning tool siloed from the others.</p><p id="a73e" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">In order to cleanly track vulnerabilities throughout the pipeline, we have developed a UUID generating process for every vulnerability type we encounter. This UUID is mappable to the vulnerability and the asset it is found in. 
The fields that make up the UUID vary by vulnerability type–for example, for our third-party packages, we track vulnerabilities by <em class="od">asset</em> + <em class="od">package name</em> + <em class="od">package version.</em> It is less important to individually track every CVE present in the asset, given that fixing a package version will address every related CVE. When we detect that an asset is no longer using a specific package or version, we can feel confident that the vulnerability is no longer present. And unifying multiple CVEs into one UUID is easier to manage.</p><p id="2690" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">This also helps when scanners have some overlap. Instead of using the limited vendor features directly, we leverage their APIs to ingest the results so that we can process the alerts according to our needs, which lets us deduplicate repeated alerts and verify fixes. We can then combine results as needed or add additional information.</p><h2 id="fcbd" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Step 2: Contextualize Risk</h2><p id="ac7e" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Our next step was to automate risk assessment by taking the default severity calculation provided by the scanner and integrating additional context.</p><p id="ebfc" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">All companies have different infrastructure setups and mitigation strategies. For example, vulnerabilities involving DDoS aren’t as impactful if the load balancers have existing mitigations. 
On the other hand, a vulnerability in an application that handles PII can be much more severe than initially assessed by a scanner.</p><p id="3284" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">To improve the accuracy of our risk assessment, we take into account the following:</p><ul class=""><li id="e4b2" class="nb nc jh kr b ks kt kw kx la nd le ne li nf lm ng nh ni nj gb"><strong class="kr ji">Internal mitigations</strong>: Certain vulnerability types may not be relevant in our infrastructure.</li><li id="515f" class="nb nc jh kr b ks nk kw nl la nm le nn li no lm ng nh ni nj gb"><strong class="kr ji">Common Vulnerability Scoring System (CVSS) vector</strong>: The type of exploit is important. For example, if the attack requires local privileges and user interaction, the odds of exploitation are significantly lower, even if the impact is severe. Breaking down the CVSS base score by attack vector allowed us to gain a better understanding of the vulnerability risk.</li><li id="36d1" class="nb nc jh kr b ks nk kw nl la nm le nn li no lm ng nh ni nj gb"><strong class="kr ji">Asset risk</strong>: Is the asset public facing? Or is it an internal service that handles low-priority metadata? Does it work with PII? How critical is it to production flows?</li><li id="a38c" class="nb nc jh kr b ks nk kw nl la nm le nn li no lm ng nh ni nj gb"><strong class="kr ji">Multiple external scoring systems and metadata</strong>: Is there a difference between the National Vulnerability Database (NVD) rating, the Red Hat rating, and the vendor rating? Is this a package that hasn’t been updated in 3 years? Is this an old, unmaintained third-party package?</li></ul><p id="39f2" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">It’s critical that this step be automated as much as possible. 
We don’t want to have engineers reading through every CVE guide to determine how severe a vulnerability is.</p><p id="8869" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Depending on a company’s situation, tracking asset risk context may require engineering resources from both the security org and other engineering teams. This could involve engineering work to gather metadata from the codebase or cloud infra provider, or sorting through large existing datasets and compiling the most useful information for reference.</p><p id="5ad1" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">While that information may not always be available depending on the stage of the company, keeping track of asset information is a long-term investment that will pay dividends as a security program matures.</p><p id="efac" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">There’s no need to wait until all the relevant information is available–even a partial dataset is helpful to start, and the risk algorithm can be tuned as more data comes in. For example, in a microservice architecture, owners could fill out a YAML file with attributes of the service on creation, and eventually graduate to more automated evaluation of service metadata.</p><h2 id="a18d" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Step 3: Reporting and Remediation</h2><p id="8032" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Once we’d standardized the format of the vulnerabilities and tracked them via UUIDs, we created a generic reporting service. Shared logic for ticket creation, closing, and metadata tracking is handled in the service so that we don’t need to constantly reinvent the wheel. 
We will go into more detail on the reporting service in the <em class="od">Implementation</em> section.</p><h2 id="757d" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Step 4: Verification</h2><p id="8960" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Once the vulnerability has been marked as fixed by the owner, we want to programmatically verify that it is truly gone. As our guiding principle states, humans are always the weak link in a process. Anyone can mistakenly close a ticket as fixed, and we can’t have 100% confidence unless we verify that the vulnerability is truly gone.</p><p id="54e4" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">For scanners that report state daily, once a UUID is no longer present, we can mark the vulnerability as verified. For more complicated ones, we can write separate jobs that pass the UUID status to the reporting service so tickets can be closed and/or verified. For example, if we are tracking a vulnerability ad hoc, we can collect information from deployment pipelines to ensure that the patch has been successfully deployed.</p><h1 id="5b0e" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Implementation and Scaling</h1><p id="9238" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">We want to share the specific implementation of our pipeline; however, it is worth noting that there are many different technologies that could be leveraged to handle similar logic.</p><h2 id="eeb2" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Detection</h2><p id="f115" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Vulnerability scanners can be performance heavy. 
Moreover, deploying many vendor solutions and/or agents can increase an organization’s attack surface. It is key to understand what data the vulnerability scanner will provide and consider if an already rolled-out agent (e.g., Osquery, AWS Systems Manager Agent (SSM), etc.) could provide similar output. So aside from a few necessary traditional vendor solutions, we primarily leveraged agents that were already being used for other purposes to identify security vulnerabilities.</p><h2 id="a2dc" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Airflow</h2><p id="3cf1" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Our process is primarily implemented in <a class="au ln" rel="noopener" href="https://medium.com/airbnb-engineering/airflow-a-workflow-management-platform-46318b977fd8">Airflow</a>, an open-source data processing tool released by Airbnb a few years ago. Airflow allows us to create scheduled jobs with upstream and downstream dependency management. Datasets can be tracked on any timeframe, and it is easy to backfill or rerun jobs on specific dates (in contrast to a standard cronjob).</p><p id="3220" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">The steps are implemented in <a class="au ln" href="https://airflow.apache.org/docs/apache-airflow/stable/concepts/overview.html" rel="noopener ugc nofollow" target="_blank">Directed Acyclic Graphs (DAGs)</a> that can be chained together. We use this to create a data processing pipeline to ingest the alerts and process them as mentioned in our pipeline explanation. 
Shared data like internal risk context, ingested vulnerability feeds, and NVD/Red Hat scoring criteria can be easily accessed at any point of the pipeline for writing contextualized risk logic.</p><h2 id="110a" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Reporting Service</h2><figure class="ms mt mu mv gy ki gm gn paragraph-image"><div role="button" tabindex="0" class="kj kk dp kl cf km"><div class="gm gn oe"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/1400/0*J5vHCbaIQTgs_5sN" width="700" height="482" role="presentation" /></div></div><figcaption class="mx bm go gm gn my mz bn b bo bp co"><em class="na">Fig. 3: Reporting Service Logic</em></figcaption></figure><p id="13d6" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Now that we had the process for a single type of vulnerability, we wanted to be able to easily scale for any new type of vulnerability that we start tracking.</p><p id="8b5e" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">This is where the importance of the reporting service comes in. Our reporting service doesn’t care about the details of the vulnerability. All it needs is the data stored in a table with the expected schema and the client-provided callbacks to create the ticket copy.</p><p id="85a3" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">The modular nature of Airflow DAGs and the reporting service makes it simple to add a new vulnerability source into our systems. Our team doesn’t have to be responsible for writing every ingest / risk assessment process, and external teams using our pipeline don’t have to worry about managing the shared vulnerability tracking logic. 
Shared metadata can be reused over and over again, so things like risk assessment logic get easier every time.</p><figure class="ms mt mu mv gy ki gm gn paragraph-image"><div role="button" tabindex="0" class="kj kk dp kl cf km"><div class="gm gn of"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/1400/0*gqspSTfgjHG6PCSj" width="700" height="513" role="presentation" /></div></div><figcaption class="mx bm go gm gn my mz bn b bo bp co"><em class="na">Fig. 4: How we scale our reporting service for any number of alert types.</em></figcaption></figure><h1 id="ccc4" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Results</h1><h2 id="9234" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Scaling</h2><p id="c5d1" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Before we standardized on this system, the vulnerability management team had to be much more involved in the vulnerability tracking process, creating a bottleneck. It was difficult to have a different team own a specific kind of scanner or vulnerability type, as we had to be deeply involved in how it was tracked.</p><p id="524d" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">After we rolled out this process, the number of vulnerabilities we were able to manage increased dramatically. 
<a class="au ln" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-3-34e592c45d46">Multiple security teams were able to integrate with our pipeline for their own purposes</a>, and unifying the functionality across our org allowed us to automatically get metrics that give us insight into our full attack surface.</p><h2 id="56c9" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Operational Work</h2><p id="2a90" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Our work with contextualizing risk also made an enormous difference in managing the operational work around risk assessment.</p><figure class="ms mt mu mv gy ki gm gn paragraph-image"><div class="gm gn og"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/910/0*rNTgLOG3QUSPktNn" width="455" height="415" role="presentation" /></div><figcaption class="mx bm go gm gn my mz bn b bo bp co"><em class="na">Fig. 5: False positive rate over time</em></figcaption></figure><p id="98e6" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">For example, when we first deployed a new scanner, a large percentage of the alerts were false positives. We spent a lot of time going through the tickets daily to filter out the noise and identify the highest priority CVEs. Over the course of several months, we tuned our risk assessment algorithm to take in different kinds of criteria aside from the default severity score provided by our scanner. 
Now, we only occasionally have to manually review tickets to validate severity (primarily criticals, which can be difficult to distinguish from highs), and we trust that alerts are most likely accurate.</p><h2 id="acdb" class="np lp jh bn lq nq nr ns lu nt nu nv ly la nw nx mc le ny nz mg li oa ob mk oc gb">Case Study: Log4Shell</h2><p id="8c4c" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">A side effect of having a scalable vulnerability management system is that it is significantly easier to react quickly during critical incidents.</p><p id="b77c" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Like the rest of the internet, we had to act fast when the Log4j vulnerability was disclosed. Several years ago, the work would have been both operationally and programmatically difficult, and would have required significant time and resources. However, our new pipeline allowed us to respond much more quickly. We simply wrote a new DAG to track all services running Java and passed it into the reporting service. While our engineers were busy patching the services, we wrote a second DAG that programmatically detected whether a service had been patched. We then were able to confidently verify the status of a service, while reopening tickets for incorrectly fixed ones.</p><h1 id="6606" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Takeaways and Suggestions</h1><p id="6ce5" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Vulnerability management is a hard problem to solve, and getting to a solution that works best in a custom environment takes time. 
It is key for organizations to prioritize the problem space using automation to allow them to quickly address known attack surfaces.</p><p id="f753" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Vulnerability management should be treated as an engineering problem, not as an operational problem. If you have not yet adopted this approach, hopefully the benefits we have described convince you to take steps towards this goal. And just like every engineering solution, you will learn to adjust your approach as you gather more datasets and metrics about your environment.</p><p id="90b7" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">On top of detection and metrics tracking, it’s important to prioritize automation to address vulnerability root causes because metrics alone will not reduce attack surface. Prevention is always better than remediation.</p><p id="fe27" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Lastly, be creative with your solutions. Survey your existing tools, even ones not specifically geared towards security, to see if they can provide vulnerability insights. Your vulnerability management automation pipeline should be modular and vendor-agnostic to provide flexibility to incorporate all available data sources. The more you can reuse existing tools, the less additional attack surface you’ll need to maintain while still providing valuable signal.</p><h1 id="61fe" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">Acknowledgements</h1><p id="d4e7" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb">Thanks to Deanna Bjorkquist who has helped drive the Vulnerability Management program and automation requirements. 
Thanks to Derek Wang for code excellence and feature expansion. Thanks to Christopher Barcellos for reviewing and providing feedback for our blog post. Thanks to Tina Nguyen for helping drive and make this blog post possible. Thanks to Mark Vlcek for his work on some of our scanning solutions. Thanks to the internal Airbnb Airflow team for their technology support.</p><h1 id="c8ff" class="lo lp jh bn lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml gb">****************</h1><p id="5d40" class="pw-post-body-paragraph kp kq jh kr b ks mm ku kv kw mn ky kz la mo lc ld le mp lg lh li mq lk ll lm ja gb"><em class="od">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/sisyphus-and-the-cve-feed-vulnerability-management-at-scale-e2749f86a7a4</link>
      <guid>https://medium.com/airbnb-engineering/sisyphus-and-the-cve-feed-vulnerability-management-at-scale-e2749f86a7a4</guid>
      <pubDate>Wed, 10 Aug 2022 19:05:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb’s Approach to Access Management at Scale]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fk l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@pbramsen?source=post_page-----cfa66c32f03c--------------------------------"><div class="l dp"><img alt="Paul Bramsen" class="l ci fm ha hb fq" src="https://miro.medium.com/fit/c/96/96/1*b1dSULdfhAWSKrzkaLk4IQ.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b dn do gb"><div class="hc o hd"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@pbramsen?source=post_page-----cfa66c32f03c--------------------------------">Paul Bramsen</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Aug 8</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">11 min read</div></div></div></div><div class="o ao"><div class="h k hw hx hy"><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="ic o ao"></div><div class="cl ii"></div></div></div><div class="ij ik il j i d"><div class="fk l"><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ja jb jc jd je"><div class=""><p id="5502" class="pw-post-body-paragraph ke kf jh 
kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><strong class="kg ji">How Airbnb securely manages permissions for our large team of employees, contractors, and call center staff.</strong></p><p id="059b" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><strong class="kg ji">By:</strong> <a class="au lc" href="https://www.linkedin.com/in/paul-bramsen-9a98638b/" rel="noopener ugc nofollow" target="_blank">Paul Bramsen</a></p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn ld"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/0*l86VJi1iVOnE7ngb" width="700" height="467" role="presentation" /></div></div></figure><h1 id="63c6" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Introduction</h1><p id="bab5" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Airbnb is a company that is built on trust. An important piece of this trust comes from protecting the data that our guests and hosts have shared with us. One of the ways we do this is by following the <a class="au lc" href="https://en.wikipedia.org/wiki/Principle_of_least_privilege" rel="noopener ugc nofollow" target="_blank">principle of least privilege</a>. Least privilege dictates that–in an ideal world–an employee has the exact permissions they need at the moment their job requires them. Nothing more, nothing less. Anything more introduces unnecessary risk–whether from a malicious employee, compromised laptop, or even just an honest mistake. 
Anything less inhibits productivity.</p><p id="8254" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Not only has enforcing least privilege always been crucial for maintaining trust, it’s rapidly becoming a legal necessity. Airbnb operates in <a class="au lc" href="https://news.airbnb.com/about-us/" rel="noopener ugc nofollow" target="_blank">almost every country and region in the world</a>, which requires us to comply with an ever-increasing set of data privacy regulations.</p><p id="bb06" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">In small companies, where an individual can track the work of all colleagues, administrators can effectively solve these problems with minimal tooling. But as a company grows, this approach does not scale. In this post, we will explain how Airbnb uses a novel software solution to maintain least privilege while enabling our large team of employees, contractors, and call center agents to do our jobs effectively and efficiently.</p><h1 id="fe6d" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Where We Started</h1><p id="7f23" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">In Airbnb’s early days, a combination of homegrown and vendor solutions was implemented, but the lack of a unifying architecture prevented us from scaling. 
The hodge-podge of systems used to control access made it difficult to hit either of our least privilege goals:</p><ul class=""><li id="d588" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb mx my mz na gb">It was often unclear where employees could get needed permissions, hampering productivity.</li><li id="0143" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Projects aimed at reducing unnecessary access (i.e., drive least privilege) required significant effort across many systems. Integrating access control with a new system took months of engineering effort when it should have been one or two days.</li></ul><p id="bd6a" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Ultimately, these factors led to growing operational burden, reduced security, and increased hours required for compliance efforts. This led us to the following conclusion: <strong class="kg ji">we need a single place to manage employee access</strong>.</p><h1 id="da30" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Clarifying Focus</h1><p id="c32e" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Having determined the need for centralized access control, we worked to set guiding principles for the solution we would implement. Ultimately, we boiled down the requirements for our system to two goals:</p><ol class=""><li id="5402" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb ng my mz na gb">The access control system should manage the entirety of the processes and logic around a permission’s lifecycle. 
This includes:<br />– Self-serve ways to request or revoke permissions.<br />– Settings to control who has to approve new permissions.<br />– Tools for managing groups of permissions.<br />– Settings for automatic permission expiration.<br />– Logging to meet operational and compliance requirements.<br />– Notifications about relevant permission updates like upcoming expirations or when an approval is required.<br />All of these features should be controlled declaratively for each available permission and the system should use these declarations to implement all necessary logic and actions.</li><li id="57d5" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb ng my mz na gb">We wanted to build a system that could easily and robustly integrate with any permission store (e.g., AWS IAM, LDAP, <a class="au lc" href="https://ranger.apache.org/" rel="noopener ugc nofollow" target="_blank">Apache Ranger</a>, MySQL, <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/himeji-a-scalable-centralized-system-for-authorization-at-airbnb-341664924574">Himeji</a>, etc.) without the need to modify it. To use a network analogy, the permission stores would be the data plane that enforces authorization while our access control system is the control plane that coordinates everything. This requirement led us to focus on providing the interface needed to efficiently synchronize permission changes from the central access control system into the permission stores. This would be accomplished using a little glue code for each store (allowing us to maintain the generality of the central system).</li></ol><p id="1dce" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">We also clarified what the system would <em class="nh">not</em> be.</p><ul class=""><li id="2330" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb mx my mz na gb"><strong class="kg ji">Not</strong> a hyper-reliable, hyper-low-latency way to answer online permission checks. 
The permission stores themselves would answer online authorization queries and, because we would sync with them, they could automatically act as a cache if the central access control system went down. While availability and performance are always important, our primary focus would be on the permission management logic.</li><li id="b630" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb"><strong class="kg ji">Not</strong> a place to dump one-off authorization code. Some of the prior permission management systems had evolved into authorization code dumping grounds incurring significant technical debt.</li><li id="c920" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb"><strong class="kg ji">Not</strong> a place to store permissions for our guests and hosts. Public product permission management requirements are generally quite different from permissions we grant to employees to access our internal tooling and data to do their jobs. The scales also generally differ by many orders of magnitude. Additionally, internal permissions are usually significantly more complex. So it makes sense to handle each case separately.</li></ul><p id="de09" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">If you could only take one thing away from this post, take away these goals. Clarifying our focus and using these two goals as our north star was the most critical step in building our centralized access control platform.</p><h1 id="eb43" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Build Vs Buy</h1><p id="4cc0" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">We evaluated a number of products on the market but none of them solved for our specific goals. 
Generally, permissions were managed by a small group of knowledgeable administrators, operationalizing the approval process and failing our first goal. Additionally, integrations usually required modifying the client. While some of the permission stores already had plugins (e.g., LDAP plugin), others did not.</p><p id="60fd" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">We hope that eventually a startup builds a solution that implements a centralized, self-serve, easy-to-plug-in model. We think this could provide a lot of value to other companies that don’t have the scale to justify building an in-house solution like ours.</p><h1 id="3399" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Architecture</h1><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn ni"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/1*L-FvRQ0fyPLiIS-LcYQiwg.png" width="700" height="194" role="presentation" /></div></div><figcaption class="nj bm go gm gn nk nl bn b bo bp co">Each stage makes requests to the prior stage as updates flow through the system from left to right. Note that for the purposes of this article we are only building stages 2 and 3. We can assume stages 1, 4, and 5 already exist.</figcaption></figure><p id="891c" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">We designed a system with a linear five-stage architecture. Changes flow from left to right. The architecture is linear in the sense that each stage can query the previous stage, but no others. 
For example, the Access Control Platform can query Employee Data Systems and can be queried by Connectors, but never communicates directly with Permission Stores.</p><p id="0fbc" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">A stage can also have limited communication with the immediately prior stage through loosely coupled channels like queues or callbacks. For example, the Access Control Platform can enqueue an update message that will be consumed by a Connector.</p><ol class=""><li id="4498" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb ng my mz na gb"><strong class="kg ji">Employee Data Systems</strong><br />These are the HR systems (e.g., LDAP) that contain employee data (e.g., title, location, status, management chain). The Access Control Platform ingests this data to enable features like dynamic groups based on title and approval flows based on management chain. These systems are owned by the IT team.</li><li id="6ba5" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb ng my mz na gb"><strong class="kg ji">Access Control Platform</strong><br />This is the core system. This includes all the business logic to manage permissions as well as the UI that employees use to make and/or approve changes. The Access Control Platform is highly configurable but does not directly interact with any permission stores that integrate with it. The security team owns this system.</li><li id="d9c9" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb ng my mz na gb"><strong class="kg ji">Connectors</strong><br />Connectors are the glue code that connects the Access Control Platform to the Permission Stores. They serve two purposes. First, connectors tell the Access Control Platform what permissions should be available for request. For example, the data warehouse connector might make read access to the users and reservations Hive tables available for request. 
Second, if user bob received access to read reservations, the data warehouse connector would synchronize this permission into the appropriate permission store (<a class="au lc" href="https://ranger.apache.org/" rel="noopener ugc nofollow" target="_blank">Apache Ranger</a> in this case). Since connectors are simply responding to messages on a queue by making the appropriate API calls, they can run in whatever environment their owner deems best (e.g., Kubernetes job, AWS Lambda, Airflow DAG). They are owned and operated by the team that owns the corresponding permission store. For example, the storage team owns the MySQL connector.</li><li id="a239" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb ng my mz na gb"><strong class="kg ji">Permission Stores</strong><br />Permission stores are the systems that store the permissions and answer permission queries, for example AWS IAM, LDAP, <a class="au lc" href="https://ranger.apache.org/" rel="noopener ugc nofollow" target="_blank">Apache Ranger</a>, MySQL’s built-in permission system, <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/himeji-a-scalable-centralized-system-for-authorization-at-airbnb-341664924574">Himeji</a>, or other internal systems. 
Note that in some cases Permission Stores may be built into clients, in which case stages 4 and 5 would be combined, as is the case for MySQL.</li><li id="30eb" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb ng my mz na gb"><strong class="kg ji">Clients</strong><br />The clients are all the systems that the end user needs, for example SSH, Apache Superset, MySQL, internal customer support tools, Salesforce, etc.</li></ol><h1 id="fb9e" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Benefits Realized</h1><p id="6ee1" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Two years ago we implemented this architecture and since then we’ve integrated many systems into this centralized Access Control Platform. Here we highlight a few of the benefits we’ve realized.</p><h2 id="2aff" class="nm lq jh bn lr nn no np lv nq nr ns lz kp nt nu md kt nv nw mh kx nx ny ml nz gb">Security</h2><p id="8a6e" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">One of the biggest wins for security is having a single place where we can implement new least privilege features and then apply them across the board (as opposed to implementing once for AWS, once for MySQL, once for SSH, etc.). A great case study is usage-based expiration. Usage-based expiration is a feature where permissions that have not been used for a significant period of time are automatically revoked. This approach is good for security because unnecessary permissions are quickly cleaned up. But it is also good for the user experience because employees can rest assured that the permissions being removed aren’t the ones they use regularly. Before the revocations happen, the Access Control Platform notifies impacted employees about the upcoming change and provides instructions on what to do if they need to keep the permissions. 
The notifications also provide links to low-friction ways to get the permissions back after the revocations happen, should employees realize later that the permissions were needed.</p><p id="c861" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">The chart below shows the relative change in the number of users with access to a core production system we’ll call System X. After we rolled out usage-based expiration at the end of April, the number of users with access to System X settled at a steady state of about one third of its peak. A significant least-privilege win! We saw similar results in other systems where we rolled out usage-based expiration.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn oa"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/0*ahfDNc4H4XLmdXfF" width="700" height="434" role="presentation" /></div></div><figcaption class="nj bm go gm gn nk nl bn b bo bp co">Users with System X access dropped by two thirds after enabling usage-based expiration in late April.</figcaption></figure><p id="a79f" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Another security benefit has been the ability to roll out consistent compliance changes across all systems as new regulations are introduced. For example, we could enable a rule that requires North American employees to get special approval from our European legal counsel to access certain protected data for European customers. 
This rule can be consistently applied across many systems such as online databases, offline datastores, and customer support tooling.</p><p id="a3c7" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Another win has been having a centralized database against which we can compute cross-system least-privilege metrics and track our progress over time. The chart above was generated using this database.</p><h2 id="c239" class="nm lq jh bn lr nn no np lv nq nr ns lz kp nt nu md kt nv nw mh kx nx ny ml nz gb">Usability</h2><p id="4737" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Having a centralized access control platform has been a big win for usability. Because access requests are consolidated, users no longer need to know about the N different places they would otherwise go to request access. In effect, we’ve created a one-stop shop for all access at Airbnb. Just search for what you need access to and we’ll guide you through the rest.</p><p id="960b" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">The self-serve features we’ve built into the platform have helped reduce operational overhead. Employees can request permissions without having to involve a support engineer. When managers go out of town, they can delegate a peer to approve changes on their behalf. 
Users have self-serve revoke for their own permissions, their reports’ permissions, or permissions for systems they manage.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn oa"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/0*RItgoPuQHokQDkup" width="700" height="434" role="presentation" /></div></div><figcaption class="nj bm go gm gn nk nl bn b bo bp co">Providing good self-serve access control tooling has significantly cut support costs.</figcaption></figure><h2 id="f5f7" class="nm lq jh bn lr nn no np lv nq nr ns lz kp nt nu md kt nv nw mh kx nx ny ml nz gb">Developer Experience</h2><p id="a902" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">We’ve put significant effort into making it as easy as possible for developers to build the connectors that link the Access Control Platform with permission stores. A large portion of this effort has been building great tools.</p><p id="0daa" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">As an example, a design decision we made that has proved extremely useful in providing a strong developer experience is notifying connectors about changes via an asynchronous message queue. Whenever a permission’s state changes, the Access Control Platform sends a message to the queue. 
The queue is processed by the connector that’s responsible for syncing the state of the updated permission into the permission store.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn ni"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/0*Yj_z9ZaQScTCHMKN" width="700" height="394" role="presentation" /></div></div><figcaption class="nj bm go gm gn nk nl bn b bo bp co">The permission’s state (granted or revoked) has to be fetched from the Access Control Platform. It is not included in the enqueued message.</figcaption></figure><p id="a725" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">The contents of the message are a critical part of the design. The message contains what the permission is and who it is for, but <em class="nh">not</em> whether the permission was granted or revoked. To get the current state (granted / revoked), the connector must query the platform. You can think of the message as a trigger to cause the system to resync permission X for user Y.</p><p id="e755" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">This design has the following properties:</p><ul class=""><li id="cee4" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb mx my mz na gb">Because the latest state is always fetched (granted / revoked), message processing is idempotent.</li><li id="f75d" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">This allows us to use at-least-once delivery semantics, greatly simplifying the process of ensuring that the proper messages are sent every time a permission changes. 
If a permission changes but the process is killed (perhaps due to a deploy) before we’ve recorded that the platform triggered the necessary update message, we just trigger the message again in a clean-up process.</li><li id="91e7" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Replay attacks are nullified: a replayed message can only trigger a resync to the current state. So we let connector developers freely enqueue messages to aid in debugging. For a connector developer, this is quite useful when trying to determine why a permission sync is failing.</li><li id="272f" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">If an update fails too many times, the message goes to a dead letter queue and the team responsible for the connector is alerted. Developers then use our tools to read the messages from the dead letter queue and debug the failing updates. Once the issues are fixed, all failed messages can be re-enqueued, which brings all permissions back in sync.</li><li id="aad0" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">We run regular offline jobs to do bulk permission diffs and identify any permission changes that need backfilling. Then we trigger resyncs by enqueuing update messages for these permissions. This means that connector developers only need to write code to support incremental sync rather than both backfill and incremental sync. The backfills are free!</li></ul><h1 id="52aa" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Conclusion</h1><p id="2659" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Managing permissions and ensuring least privilege is a challenge at any company and especially difficult in large companies. Many companies come up with operationally heavy solutions that are expensive and insecure and that provide a negative user experience. 
At Airbnb, we’ve solved this challenge by implementing a centralized, self-serve access control platform. What made our investments such a success was solving Airbnb’s unique goals in a cohesive and scalable way, and what is very rare is the degree to which we’ve actually rolled this out in production. The majority of permissions at Airbnb are managed by our Access Control Platform. Our approach has enabled us to make huge strides in ensuring that we’re doing everything we can to keep our community’s data safe while at the same time enabling Airbnb’s employees to do our best work.</p><p id="1270" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">We’ve made a lot of progress in the access management space, but there is still a lot to do! If you’re interested in working on this or other efforts to protect Airbnb’s community, check out security and software engineering jobs on <a class="au lc" href="https://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">our careers page</a>.</p><h1 id="5f63" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Acknowledgments</h1><p id="4f91" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">The Access Control Platform we’ve built was the result of hard work from many collaborators at Airbnb. 
<a class="au lc" href="https://www.linkedin.com/in/zhusamuel/" rel="noopener ugc nofollow" target="_blank">Samuel Zhu</a>, <a class="au lc" href="https://www.linkedin.com/in/alissara-rojanapairat/" rel="noopener ugc nofollow" target="_blank">Alissara Rojanapairat</a>, <a class="au lc" href="https://www.linkedin.com/in/kyler-mejia-a9a323101/" rel="noopener ugc nofollow" target="_blank">Kyler Mejia</a>, <a class="au lc" href="https://www.linkedin.com/in/stephynancy/" rel="noopener ugc nofollow" target="_blank">Stephy Nancy</a>, and Maryna Butovych built significant portions of the system and contributed to the architecture. <a class="au lc" href="https://www.linkedin.com/in/alanyao/" rel="noopener ugc nofollow" target="_blank">Alan Yao</a> and <a class="au lc" href="https://www.linkedin.com/in/abhishek-parmar-924b529a/" rel="noopener ugc nofollow" target="_blank">Abhishek Parmar</a> provided invaluable feedback that influenced the architecture. <a class="au lc" href="https://www.linkedin.com/in/julia-k-cline/" rel="noopener ugc nofollow" target="_blank">Julia Cline</a> ensured that we were building a product that would meet the needs of our customers. <a class="au lc" href="https://www.linkedin.com/in/brettbukowski/" rel="noopener ugc nofollow" target="_blank">Brett Bukowski</a>, Jacqui Watts, <a class="au lc" href="https://www.linkedin.com/in/julia-k-cline/" rel="noopener ugc nofollow" target="_blank">Julia Cline</a>, <a class="au lc" href="https://www.linkedin.com/in/patmoynahan/" rel="noopener ugc nofollow" target="_blank">Pat Moynahan</a>, and <a class="au lc" href="https://www.linkedin.com/in/chris408/" rel="noopener ugc nofollow" target="_blank">Christopher B</a> provided valuable feedback on this blog post. 
<a class="au lc" href="https://www.linkedin.com/in/tinamn/" rel="noopener ugc nofollow" target="_blank">Tina Nguyen</a> and <a class="au lc" href="https://www.linkedin.com/in/laurenmackevich/" rel="noopener ugc nofollow" target="_blank">Lauren Mackevich</a> shepherded this blog post through the process. And many other colleagues contributed in small and large ways to make this possible.</p><h1 id="8608" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">****************</h1><p id="6d66" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb"><em class="nh">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnbs-approach-to-access-management-at-scale-cfa66c32f03c</link>
      <guid>https://medium.com/airbnb-engineering/airbnbs-approach-to-access-management-at-scale-cfa66c32f03c</guid>
      <pubDate>Mon, 08 Aug 2022 19:02:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Incident Management]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fk l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@vlad.vassiliouk?source=post_page-----ae863dc5d47f--------------------------------"><div class="l dp"><img alt="Vlad Vassiliouk" class="l ci fm ha hb fq" src="https://miro.medium.com/fit/c/96/96/1*aePErJ19vpsRkJCfRj7i9w.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b dn do gb"><div class="hc o hd"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@vlad.vassiliouk?source=post_page-----ae863dc5d47f--------------------------------">Vlad Vassiliouk</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Jul 27</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">8 min read</div></div></div></div><div class="o ao"><div class="h k hw hx hy"><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="ic o ao"></div><div class="cl ii"></div></div></div><div class="ij ik il j i d"><div class="fk l"><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ja jb jc jd je"><div class=""><p id="55aa" 
class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">How Airbnb automates incident management in a complex, rapidly evolving ensemble of microservices.</p><p id="82f5" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb"><a class="au lc" href="https://www.linkedin.com/in/vladimir-vassiliouk" rel="noopener ugc nofollow" target="_blank">Vlad Vassiliouk</a></p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn ld"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/1*hP8PSrLw_LTyLjhbR6LRxg.jpeg" width="700" height="467" role="presentation" /></div></div></figure><h1 id="21cb" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Incident Management</h1><p id="7e4c" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Incidents are unforeseeable events that disrupt normal business operations and are inevitable in complex systems that must be up and running 24/7. This is why it’s important to prepare and to train people to handle incidents in a timely and organized manner. Although each incident is unique, we follow the same procedure for detection, escalation, management, and resolution of incidents.</p><p id="0405" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">At Airbnb, we use a <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/a-krispr-approach-to-kubernetes-infrastructure-a0741cff4e0c">service-oriented infrastructure</a>, which involves many interconnected services managed by small teams. Quickly figuring out which service is in trouble and whom to page is paramount to timely incident resolution. 
We found that our teams spent a lot of time switching between applications such as Slack, PagerDuty, and Jira to raise an incident, page responders, and provide context. To resolve incidents quickly, we developed an incident management bot, a centralized automation tool for incident management.</p><h1 id="52d0" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Incident Management Slack Bot</h1><p id="62a8" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Our goal was to centralize incident management in Slack. Everyone at Airbnb is familiar with and has access to Slack, and it’s easy to bring people and resources together in an incident channel. In addition, the incident channel acts like a timeline of events, which makes putting together a post mortem report easy.</p><p id="2a98" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Our requirements were as follows:</p><ul class=""><li id="6b3f" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb mx my mz na gb">Run in Airbnb’s <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/a-krispr-approach-to-kubernetes-infrastructure-a0741cff4e0c">service-oriented infrastructure</a> and have full support from our team.</li><li id="9e43" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Standardize incident-related communications in all tools such as Jira, Slack, and PagerDuty.</li><li id="83ba" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Centralize incident management in Slack.</li><li id="9655" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Provide a single intake funnel for incidents with clearly defined steps.</li><li id="87f7" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Automate post-incident tasks such as setting up meetings and 
archiving channels.</li><li id="5218" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb">Provide incident timelines and metrics.</li></ul><p id="e9b1" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">We decided to build our own app to meet our exact specifications and to let us easily customize and extend it. We also chose to build the app in Go because of its great community and the well-documented <a class="au lc" href="https://pkg.go.dev/github.com/slack-go/slack" rel="noopener ugc nofollow" target="_blank">Slack library</a>.</p><p id="c40a" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Finally, we decided to use chat commands instead of <a class="au lc" href="https://slack.com/help/articles/201259356-Slash-commands-in-Slack" rel="noopener ugc nofollow" target="_blank">slash commands</a> so that all commands sent to the bot would be visible to the members of the Slack channel.</p><p id="26f4" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Our incident management bot achieves incident response automation through four key commands:</p><ul class=""><li id="8d69" class="ms mt jh kg b kh ki kl km kp mu kt mv kx mw lb mx my mz na gb"><strong class="kg ji">new incident &lt;summary&gt;: </strong>Create a Jira ticket and page incident managers.</li><li id="107f" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb"><strong class="kg ji">new channel &lt;ticket&gt;: </strong>Create an incident Slack channel for an open incident ticket.</li><li id="6235" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na gb"><strong class="kg ji">page &lt;service|user&gt;: </strong>Page the on-call(s) for a PagerDuty service or a user directly.</li><li id="4d57" class="ms mt jh kg b kh nb kl nc kp nd kt ne kx nf lb mx my mz na 
gb"><strong class="kg ji">get timeline:</strong> Compile a concise timeline of important chat events for post-incident analysis.</li></ul><h1 id="a8c2" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Incident Response Lifecycle</h1><p id="6e0c" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">We have defined four separate phases of an incident: detection, communication, escalation and resolution. Each of the bot’s commands automates tasks that would normally require coordination during these distinct phases.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn ng"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/0*x2vbM-iepMqqsrH9" width="700" height="188" role="presentation" /></div></div></figure><h1 id="5853" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Detection</h1><p id="b201" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Most of our incidents are detected by our <a class="au lc" rel="noopener" href="https://medium.com/airbnb-engineering/alerting-framework-at-airbnb-35ba48df894f">monitoring and alerting tools</a>, although sometimes we learn about incidents from our team members or customers. No matter how an incident is detected, having a single intake funnel for all incidents is crucial for effective incident detection. 
Our bot solves this by providing the “new incident” command.</p><h2 id="91ef" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">New incident &lt;summary&gt;</h2><p id="7753" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">This command creates a blank Jira ticket with default settings and asks the user whether they’d like to page an incident manager.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div role="button" tabindex="0" class="lj lk dp ll cf lm"><div class="gm gn nv"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/932/0*W275nEZINolYZ6pn" width="466" height="235" role="presentation" /></div></div></figure><p id="d185" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Whether or not the user chooses to page an incident manager, a popup then asks the user for additional information.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn nw"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1038/0*cwNt9XjvCIiVXNS2" width="519" height="641" role="presentation" /></div></figure><p id="b5cc" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">This lets us escalate incidents quickly while still letting the incident responder provide valuable information for the incident managers. 
These fields are optional in the interests of urgency and can be filled out later if needed.</p><h1 id="d070" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Communication</h1><p id="51b2" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Another important first step is to set up communication channels and provide as much context as possible to responders.</p><h2 id="cba0" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">New channel [Jira ticket]</h2><p id="610a" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">This command takes an optional Jira ticket as a URL or key. If none is provided, it shows the five most recently opened incident tickets for the user to choose from. A channel is then created using the Jira ticket key, with the summary as its title, and all incident managers are invited.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn nx"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/990/0*TfXcDuFds_fv8K-8" width="495" height="190" role="presentation" /></div></figure><p id="d451" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">To provide context to all users invited, the channel’s topic is set to the Jira ticket link along with the summary of the Jira ticket. 
In addition, we update the Jira ticket with a link to the newly created Slack channel.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn ny"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/656/0*2WJcyBs7LocZ4yjP" width="328" height="110" role="presentation" /></div></figure><h1 id="b9b9" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Escalation</h1><p id="552c" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">You may have heard about the <a class="au lc" href="https://nvd.nist.gov/vuln/detail/CVE-2021-44228" rel="noopener ugc nofollow" target="_blank">Log4j security vulnerability</a>, which was characterized as the single biggest and most critical vulnerability of the last decade. Within 72 hours of vulnerability disclosure, there were reports of <a class="au lc" href="https://arstechnica.com/information-technology/2021/12/hackers-launch-over-840000-attacks-through-log4j-flaw" rel="noopener ugc nofollow" target="_blank">840,000 attacks</a> on companies globally, which turned into 100 internet-wide attacks per minute over the following weekend.</p><p id="614b" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">At Airbnb, we have over a thousand microservices with hundreds of small teams managing them, which posed a unique challenge for us. We had to identify all vulnerable services and quickly reach out to their respective owners for mitigation. This is where our Slack bot really shined, allowing our Incident Managers to reach service owners and coordinate rolling out the fix much more quickly than before. In a matter of minutes, the bot was used to page over 300 teams to assist with assessing impact and deploying patches. 
This saved 4 hours compared to paging these teams manually, not to mention reducing the time spent in a vulnerable state.</p><h2 id="3661" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">Page &lt;shortcut|service name|slack user&gt;</h2><p id="f826" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">The page command can be given a service shortcut, a service name, or a Slack user.</p><p id="9ffc" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">To get started, the user can view a list of shortcuts by typing in “page list”.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn nz"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/528/0*DVEqiuHCfuSqyruA" width="264" height="267" role="presentation" /></div></figure><p id="389f" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Each shortcut corresponds to a PagerDuty service ID, which is used when creating a PagerDuty incident. 
The shortcuts can be customized by editing a YAML file.</p><p id="bebe" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">If a user types in a service name that doesn’t match any shortcut, the bot searches the PagerDuty service directory and displays the results for the user to choose from.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn oa"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/852/0*yym1kmhJGZTzsa4c" width="426" height="191" role="presentation" /></div></figure><p id="e769" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Once a user chooses the service they want to page, they’re asked to confirm, and a new PagerDuty incident is created for that service.</p><p id="e034" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">We also allow paging Slack users directly when additional responders are required beyond those on call.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn ob"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1240/0*pw130BpOe5IFD7CZ" width="620" height="178" role="presentation" /></div></figure><p id="94c2" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Once the page command is sent, the bot creates a new incident in PagerDuty with the Jira ticket, summary, and Slack channel to provide context to the on-call person. 
After the on-call person is paged, the bot announces who was paged and invites them to the channel.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn oc"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/914/0*f0cPvK_oG91okdgm" width="457" height="136" role="presentation" /></div></figure><h1 id="cf77" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Resolution</h1><p id="0fe6" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Once responders confirm there is no further user impact and a root cause is known, the incident is considered resolved and the team transitions to the post-incident phase. A robust timeline is required to have an effective post-incident review and an effective <a class="au lc" href="https://www.atlassian.com/incident-management/postmortem" rel="noopener ugc nofollow" target="_blank">post mortem</a> report.</p><h2 id="e246" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">Get timeline</h2><p id="b9e0" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">This command will search the incident channel for all chat messages marked with a specific emoji which designates the message as a timeline event, and direct message the user a compiled timeline.</p><p id="1572" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">For example, we use the ? emoji to designate important events in the chat. As the incident is ongoing, anyone can add the emoji as a reaction to important chat events. 
After the incident, the “get timeline” command compiles these chat events into a timeline that is easy to copy and paste into the post-incident report.</p><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn nv"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/932/0*vL094xETdA2Re2cR" width="466" height="78" role="presentation" /></div></figure><figure class="le lf lg lh gy li gm gn paragraph-image"><div class="gm gn od"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/944/0*Z2WAZ9sw8X_Fg0-J" width="472" height="356" role="presentation" /></div></figure><h2 id="19fe" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">Incident Review</h2><p id="84a0" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">At Airbnb, we hold weekly after-action review (AAR) meetings, where we review recent high-severity incidents and post-incident reports and ensure that any corrective actions are called out and assigned. As soon as the Jira ticket tracking the incident is updated with the AAR meeting date, the bot tells the ticket’s owner when the meeting will be and what is expected of them.</p><h2 id="a7d0" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">Follow-up Tracking</h2><p id="e2b1" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Often, during our <a class="au lc" href="https://www.atlassian.com/incident-management/postmortem/blameless" rel="noopener ugc nofollow" target="_blank">blameless postmortem process</a>, tickets for corrective actions are created and assigned to teams to avoid similar incidents in the future. To encourage quick resolution, we set a strict deadline for these tickets. 
Our bot sends the user assigned to the ticket a warning message over Slack a couple of days before the deadline, and another message if the deadline has lapsed.</p><h2 id="fdac" class="nh lq jh bn lr ni nj nk lv nl nm nn lz kp no np md kt nq nr mh kx ns nt ml nu gb">Archiving Incident Channels</h2><p id="0348" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">To keep our Slack workspace tidy, the bot automatically archives incident channels ten days after the incident’s Jira ticket has been closed.</p><h1 id="1419" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Results</h1><p id="bf4a" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">Since launch, our bot has saved our Incident Managers and responders many hours through its automation and centralization of incident management within Slack. By comparing the average time each task takes to complete manually with the time it takes using the bot’s automation, we estimate that the bot has saved 44 hours so far in 2022.</p><h1 id="e735" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">What’s Next?</h1><p id="5385" class="pw-post-body-paragraph ke kf jh kg b kh mn kj kk kl mo kn ko kp mp kr ks kt mq kv kw kx mr kz la lb ja gb">To further streamline our incident response from Slack, we plan to enhance our integration with PagerDuty.</p><p id="1537" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Currently, every time the page command is used a new PagerDuty incident is created. 
Instead, we plan to unify all pages under a single PagerDuty incident to take advantage of PagerDuty’s incident metrics and to provide more context to responders.</p><p id="f346" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">Lastly, after a PagerDuty service is paged using the bot, we don’t have visibility into the status of the PagerDuty incident in Slack. Was the page acknowledged? Did the on-call not respond? Was it escalated, and to whom? We plan to build automation to follow the PagerDuty incident and report the current status to the incident’s channel. This will also allow us to record the timeline of actions taken in the PagerDuty incident after paging the service.</p><h1 id="7bf4" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Attribution and Thanks</h1><ul class=""><li id="2a44" class="ms mt jh kg b kh mn kl mo kp oe kt of kx og lb mx my mz na gb"><a class="oh eu ck" href="https://medium.com/u/af76cda83a53?source=post_page-----ae863dc5d47f--------------------------------" rel="noopener" target="_blank">Stephen</a>: for being a great partner on the Airbnb Incident Management team and helping to define the incident management bot’s feature roadmap</li></ul></div><div class="o dy oi oj ij ok" role="separator"><div class="ja jb jc jd je"><p id="a161" class="pw-post-body-paragraph ke kf jh kg b kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb ja gb">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/incident-management-ae863dc5d47f</link>
      <guid>https://medium.com/airbnb-engineering/incident-management-ae863dc5d47f</guid>
      <pubDate>Wed, 27 Jul 2022 18:39:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Beti Gathegi]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fk l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----61c2db3d8546--------------------------------"><div class="l dp"><img alt="AirbnbEng" class="l ci fm ha hb fq" src="https://miro.medium.com/fit/c/96/96/1*PrgppbVAePgtuFs2XZa8Ig.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b dn do gb"><div class="hc o hd"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----61c2db3d8546--------------------------------">AirbnbEng</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Jul 21</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">7 min read</div></div></div></div><div class="o ao"><div class="h k hw hx hy"><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="ic o ao"></div><div class="cl ii"></div></div></div><div class="ij ik il j i d"><div class="fk l"><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="iq l fs"><div><div class="cj" aria-hidden="false"></div></div><div class="l fs"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="ja jb jc jd je"><div class=""><figure class="gq gs kf kg kh ki gm gn 
paragraph-image"><div role="button" tabindex="0" class="kj kk dp kl cf km"><div class="gm gn ke"><img alt="" class="cf kn ko" src="https://miro.medium.com/max/1400/1*mhJHc6Qqjet_mKa-bzwu-w.jpeg" width="700" height="467" role="presentation" /></div></div></figure><p id="b134" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">From exploring careers across continents to now helping others find their place at Airbnb.</p><p id="86c9" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><em class="ln">After trying a series of careers ranging from television production to university communications and marketing, </em><a class="au lo" href="https://www.linkedin.com/in/betigathegi" rel="noopener ugc nofollow" target="_blank"><em class="ln">Beti Gathegi</em></a><em class="ln"> works as a Senior Program Manager on the TechED (technical education) team at Airbnb. When she’s not lurking in the #bookworms Airbnb Slack channel, you can find Beti leading Bootcamp, our onboarding program for new technical hires, which takes engineers and data scientists through their first commit at Airbnb. Before this role, Beti was a recruiting program manager for Connect, Airbnb’s engineering apprenticeship program targeted at people from non-traditional technical backgrounds.</em></p><p id="124b" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><em class="ln">Beti herself has a non-traditional background, with a degree in journalism and several experiences outside the tech industry, including substantial time abroad. 
She is a major advocate for diversity and inclusion; part of her role in leading Bootcamp involves setting the company’s culture and encouraging new hires to shape the culture in their own unique ways.</em></p><h1 id="0bf8" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Setting my own direction</h1><p id="f8a1" class="pw-post-body-paragraph kp kq jh kr b ks mn ku kv kw mo ky kz la mp lc ld le mq lg lh li mr lk ll lm ja gb">I describe myself as half East Coast, half West Coast, with a bit of time abroad added in. I’m the child of Kenyan immigrants and I grew up in the San Francisco Bay Area, in a town called Albany, California. When I was 15, I moved to the East Coast, and it would be many years before I found myself back in the Bay Area.</p><p id="66e5" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">For a long time, I wanted to be a journalist. To that end, I studied journalism in college as part of my communications degree. I was never fixated on a specific path and certainly explored a lot to reach where I am now. My father, who pivoted later in life by getting a law degree around age 40, was my guiding light in terms of being willing to try new things. I find personal exploration liberating — crafting my own, organic path gave me a chance to figure out my likes and dislikes, as well as my skills and growth opportunities.</p><p id="e58f" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Part of my outlook on life is that it’s okay to stop something that isn’t right for yourself. Sometimes there can be a lot of inertia that makes it hard to pause and change directions, but I think making a decision to pursue another path is really brave and can be worn as a badge of honor. 
In my case, I started a master’s in liberal arts in which I was studying the South Asian diaspora and the children of Indian immigrants in particular. Inspired by the stories of others, I was eager to discover more about my own background and history. I chose to leave my program to go live in Kenya and experience Kenyan culture for myself. Until that point, I’d only been to Kenya with my family, so this was a new lens to see the country on my own.</p><h1 id="7089" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Living and working in three continents</h1><p id="6710" class="pw-post-body-paragraph kp kq jh kr b ks mn ku kv kw mo ky kz la mp lc ld le mq lg lh li mr lk ll lm ja gb">Living in Kenya was a transformative experience and helped me understand my own identity more deeply. Having previously been told, by some, that I’m not Kenyan enough or not American enough, actually living in Kenya and encountering the sheer diversity of people made me realize there’s no singular way to be Kenyan, just like there’s no one way to be American or any culture for that matter. During my time abroad, I also realized I was ready to get more hands-on experience and enter the working world.</p><p id="82fb" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Ready to live a life of adventure, I moved to New York City but stumbled into a financial crisis when seemingly everyone was getting laid off. I worked retail for a little while but otherwise didn’t last too long in New York. This was just the first in a series of new experiences, my next being at a TV production firm where I was an assistant, and where to this day I have an IMDB credit for four episodes of the show <em class="ln">Swamp Men </em>on National Geographic. 
If that wasn’t enough, I also had jobs writing TV quizzes for Nielsen, doing marketing for the University of South Florida, and working at an Australian Aboriginal art gallery.</p><p id="451e" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Eventually, I happened upon the tech industry when a friend recruited me to join Lyft in a customer support role. This was a completely new universe to me and I took every opportunity to get involved and apply my skills to a growing company. Practically by accident, my initiative to gather people via the company’s internal email list turned into Employee Resource Groups or ERGs. I helped form the Black ERG and Women’s luncheon, while also supporting others who wanted to create similar spaces for their communities. Very organically, I was taking a big part in the diversity and belonging conversation and making sure to educate myself to be a thoughtful contributor to these discussions. Later, this turned into an official job focused on Lyft’s culture, and afterward I moved to Pandora as a diversity and belonging program manager.</p><h1 id="0564" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Onboarding to Airbnb</h1><p id="cbe6" class="pw-post-body-paragraph kp kq jh kr b ks mn ku kv kw mo ky kz la mp lc ld le mq lg lh li mr lk ll lm ja gb">By the time I applied to join Airbnb, I had shifted to leading Pandora’s university recruiting program. I noticed a tremendous amount of potential in students, and found it really impactful to work on this key pipeline. That said, going directly from college to the tech industry isn’t the only way, my own career being a prime example. 
I jumped at the opportunity to join Airbnb as an apprenticeship program manager where I had the chance to revamp this pathway for engineers with unconventional backgrounds to join the company.</p><p id="385b" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">I ran the apprenticeship program for two years and while the role was primarily on the recruiting side, we worked very closely with engineering and I started to develop an interest in TechED, the team I’m on now that owns the onboarding process for Airbnb’s technical hires. I was already meeting regularly with my manager at the time, Leo, about growing in my career, so I started by sharing my goals in that setting. Leo was incredibly supportive and epitomizes how Airbnb has a culture of empowering people to explore their passions and what energizes them.</p><p id="b2de" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">I reached out to the manager on the TechED team to express my interest in the space and not too long after, a role happened to open up. After interviewing, I got my current position leading Airbnb’s Bootcamp program, the onboarding process all software engineers and data scientists go through in their first weeks at the company. It took a ton of experiences across many other roles to arrive at my current spot, but that came with immeasurable learning and I wouldn’t have it any other way! 
I feel uniquely equipped to welcome people to a new experience or challenge, having gone through so many of my own.</p><h1 id="054f" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">Leading Bootcamp and onboarding others to Airbnb</h1><p id="27f6" class="pw-post-body-paragraph kp kq jh kr b ks mn ku kv kw mo ky kz la mp lc ld le mq lg lh li mr lk ll lm ja gb">My current role aligns with my passion for helping people acclimate and providing them with the resources they need to be successful. It’s fulfilling to advocate for an incredible new hire experience, help new team members feel confident in their respective roles at Airbnb, and support them towards making their first commit or code change. I strive to be really respectful of people’s time and make Bootcamp as relevant, engaging, and valuable as possible to everyone who participates.</p><p id="8ee4" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Onboarding is a challenging task because there are multiple variables. Typically you are onboarding people in various roles, at various levels, to various teams, which may use their own tools and processes. There is a balancing act between providing general information and hyper-relevant but also highly specific information. Remote onboarding also adds its own set of challenges.</p><p id="7913" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">That being said, I love co-creating solutions. I get to work with incredibly smart people on the engineering and data science teams to identify and clarify our challenges, workshop ideas, execute solutions, monitor progress, and iterate. 
I get a ton of energy from that process and from our collaborations.</p><h1 id="c483" class="lp lq jh bn lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm gb">How the first weeks at work can leave a lasting impact</h1><p id="d380" class="pw-post-body-paragraph kp kq jh kr b ks mn ku kv kw mo ky kz la mp lc ld le mq lg lh li mr lk ll lm ja gb">I’m also grateful to the many volunteers I partner with to shape the onboarding experience for our technical hires and set them up for success. For example, we pair each new hire with a buddy from their team. They serve both to scope the hire’s starter project as well as to answer the many questions that inevitably pop up in onboarding.</p><p id="14d6" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">We have volunteers from various teams who raise their hands to host Bootcamp and lead sessions for each new hire cohort, and most of them are driven by providing a sense of belonging. Additionally, there’s a great community of collaborators across the industry to benchmark with and get mentorship from, since onboarding is a challenging problem that a lot of companies work on.</p><p id="4ce5" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb">Beyond the technical parts of onboarding, Bootcamp plays a critical role in setting Airbnb’s culture. Especially in a remote work environment, the quality of onboarding can make or break whether new hires feel a sense of community and feel comfortable engaging with it themselves. 
We emphasize belonging and inclusivity as core values of our culture, and we welcome new hires to bring their own special qualities to integrate into our ever-evolving culture.</p><p id="c9de" class="pw-post-body-paragraph kp kq jh kr b ks kt ku kv kw kx ky kz la lb lc ld le lf lg lh li lj lk ll lm ja gb"><em class="ln">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-beti-gathegi-61c2db3d8546</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-beti-gathegi-61c2db3d8546</guid>
      <pubDate>Thu, 21 Jul 2022 19:28:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[How Airbnb Safeguards Changes in Production]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@michaelcsh?source=post_page-----9fc9024f3446--------------------------------"><div class="l dq"><img alt="Michael Lin" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/0*Nd5W8pG_Yme_8QnZ" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@michaelcsh?source=post_page-----9fc9024f3446--------------------------------">Michael Lin</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">Jul 11</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">8 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div><div class="cl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><div class=""><h2 id="920b" class="pw-subtitle-paragraph jx iz ja bn b jy jz ka kb kc kd ke kf kg kh 
ki kj kk kl km kn ko co">Part I: Evolution of Airbnb’s experimentation platform</h2></div><p id="2342" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">By: <a class="au ll" href="https://www.linkedin.com/in/michaelcl/" rel="noopener ugc nofollow" target="_blank">Michael Lin</a>, <a class="au ll" href="https://www.linkedin.com/in/toby-mao/" rel="noopener ugc nofollow" target="_blank">Toby Mao</a>, <a class="au ll" href="https://www.linkedin.com/in/zack-loebel-begelman-85407698/" rel="noopener ugc nofollow" target="_blank">Zack Loebel-Begelman</a></p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go lm"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/0*0J4whYTNqGPUdUme" width="700" height="468" role="presentation" /></div></div></figure><h1 id="d488" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Introduction</h1><p id="6233" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">As Airbnb has grown to a company with over 1,200 developers, the number of platforms and channels for pushing changes to our product — and the number of daily changes we push into production — has also grown tremendously. In the face of this growth, we constantly need to scale our ability to detect errors before they reach production. However, errors inevitably slip past pre-production validation, so we also invest heavily in mechanisms to detect errors quickly when they do make it to production. In this blog post we will cover the motivations and foundations for a system for safeguarding changes in production, which we call Safe Deploys. 
Two following posts will cover the technical architecture in detail for how we applied this to traditional A/B tests, and code deploys respectively.</p><h1 id="8060" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Continuous Delivery and Beyond</h1><p id="7661" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">Airbnb’s continuous delivery team recently wrote about <a class="au ll" rel="noopener" href="https://medium.com/airbnb-engineering/continuous-delivery-at-airbnb-6ac042bc7876">our adoption of Spinnaker</a>, a modern CI/CD orchestrator. Spinnaker supports <a class="au ll" href="https://spinnaker.io/docs/guides/user/canary/" rel="noopener ugc nofollow" target="_blank">Automated Canary Analysis (ACA)</a> during deployment, splitting microservice traffic by request to compare versions of code to see if performance, error rates, or other key metrics are negatively impacted. If metrics for the new version regress, Spinnaker automatically rolls back the deployment, significantly reducing the time to remediate a bad push.</p><p id="4797" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">ACA at Airbnb has indeed caught a large number of errors early in the deployment process. However, it has a number of limitations:</p><ul class=""><li id="b0ff" class="mv mw ja kr b ks kt kv kw ky mx lc my lg mz lk na nb nc nd gc"><strong class="kr jb">Channels: </strong>Spinnaker’s ACA tests against changes to microservices. However, microservice updates are not the only source of errors that can be pushed into production. For instance, Android and iOS apps follow a release process through their respective app stores. Many “production pushes” at Airbnb may involve no new code at all, and are strictly applied through configuration changes. 
These changes include marketing campaigns or website content created with Airbnb’s <a class="au ll" rel="noopener" href="https://medium.com/airbnb-engineering/airbnbs-promotions-and-communications-platform-6266f1ffe2bd">internal content management systems</a>. While seemingly benign, pushes through these systems can have dramatic effects. For example an incident was once caused when a marketing campaign was mistakenly applied to all countries except one, instead of the original intent of targeting one specific country. This simple mistake led to empty search results for nearly all users globally, and required over an hour to identify and revert.</li><li id="0ddc" class="mv mw ja kr b ks ne kv nf ky ng lc nh lg ni lk na nb nc nd gc"><strong class="kr jb">End-to-end business metrics: </strong>Spinnaker’s ACA is driven by local system metrics, such as a microservice’s local performance and error rates; not end-to-end business metrics, such as search click-through rates and booking rates. While roll-backs based on local system metrics are valuable, they aren’t sufficient, as some of our most costly bugs impact end-to-end business metrics but not local system metrics. For instance in 2020, a simple frontend change was deployed to production without being tested on a specific browser that did not support the CSS used, preventing users on that browser from booking trips. This had no impact on system metrics, but directly impacted business metrics. <p>Unfortunately, adding business metrics to Spinnaker’s ACA system is not possible because Spinnaker randomizes traffic by request, therefore the same user may be exposed to multiple variants. Business metrics, however, are generally user based and require each user to have a fixed variant assignment. 
More fundamentally, it’s not possible because business metrics need to be measured end-to-end and when two microservices undergo ACA at the same time, Spinnaker has no way of distinguishing the respective impact of those two services on end-to-end business metrics.</p></li><li id="7bca" class="mv mw ja kr b ks ne kv nf ky ng lc nh lg ni lk na nb nc nd gc"><strong class="kr jb">Granularity: </strong>Spinnaker’s ACA tests at the level of the entire microservice. However, it’s often the case that two features are being worked on at the same time within a microservice. When ACA fails, it can be hard to tell which feature caused the failure.</li></ul><p id="a503" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">While we heavily depend upon Spinnaker’s ACA at Airbnb, it became clear there was an opportunity to complement it and address the above limitations where the circumstances call for it.</p><h1 id="1518" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Experimentation Reporting Framework (ERF)</h1><p id="ebd9" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">A/B testing has long been a fixture in product development at Airbnb. While sharing some qualities with ACA in counterfactual analysis, A/B testing has focused on determining whether a new feature improves business outcomes, versus determining whether that feature causes a system regression. 
Over the years Airbnb has developed our Experimentation Reporting Framework (ERF) to run hundreds of concurrent A/B experiments across a half dozen platforms to determine whether a new feature will have a positive impact.</p><p id="297a" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">ERF addresses the limitations of ACA listed above:</p><ul class=""><li id="86cb" class="mv mw ja kr b ks kt kv kw ky mx lc my lg mz lk na nb nc nd gc"><strong class="kr jb">Channels: </strong>With each new platform, an ERF client has been introduced to support A/B testing on it. This includes mobile, web, and backend microservices. APIs were also introduced to provide config systems an avenue to treat config changes as A/B tests.</li><li id="9474" class="mv mw ja kr b ks ne kv nf ky ng lc nh lg ni lk na nb nc nd gc"><strong class="kr jb">End-to-end business metrics: </strong>ERF is driven <em class="nj">primarily</em> by end-to-end business metrics. On the technical side, it randomizes by user, not request, and it is able to distinguish the impact of hundreds of experiments running concurrently. ERF taps into Airbnb’s <a class="au ll" rel="noopener" href="https://medium.com/airbnb-engineering/airbnb-metric-computation-with-minerva-part-2-9afe6695b486">central metrics system</a> to access the thousands of business metrics and dimensions Product and Business teams have defined to measure what matters most to Airbnb overall.</li><li id="e729" class="mv mw ja kr b ks ne kv nf ky ng lc nh lg ni lk na nb nc nd gc"><strong class="kr jb">Granularity: </strong>Where Spinnaker’s ACA runs its experiments at the level of an entire microservice, ERF runs its experiments based on what are basically feature flags embedded into the code. 
Thus, if multiple features are being developed concurrently in the same microservice, ERF can determine which one is impacting the business metrics.</li></ul><p id="6dca" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">The above characteristics of ERF address the limitations of ACA, but ERF also had a limitation of its own: it was a daily-batch system generating interactive reports intended to be consumed by human decision makers. To address the limitation of Spinnaker’s ACA, ERF needed to evolve into a near real-time system that can directly control the deployment process without human intervention.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go nk"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/0*rOQST4-KYXGe253y" width="700" height="297" role="presentation" /></div></div><figcaption class="nl bm gp gn go nm nn bn b bo bp co">Figure 1: Areas of the ERF Platform augmented to support near real-time experimentation</figcaption></figure><p id="2e51" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">This evolution had implications on both the data science behind ERF, and its software architecture. We describe the former in this post, and will describe the latter in the next post of this series.</p><h1 id="9b8c" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Realtime ERF — The Data Science</h1><p id="dd9d" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">The foundation of solid data science is solid data engineering. On the data engineering side, we needed to revisit the definitions of the business metrics to be computed in real-time. 
The metrics computed by the batch ERF system were designed for accuracy, and could take advantage of complex joins and pre-processing to achieve this. Near real-time metrics did not have this luxury, and required simplification to meet low-latency requirements.</p><p id="b92c" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Not only did we have to build new metrics, but we knew we would have to build new statistical tests as well. A safe deployment system must not be noisy; otherwise people will stop using it. Traditional methods like the t-test suffer from a variety of issues that would be extremely problematic when implemented in a real-time system. Two issues in particular are false positives due to (1) <a class="au ll" href="http://library.usc.edu.ph/ACM/KKD%202017/pdfs/p1517.pdf" rel="noopener ugc nofollow" target="_blank">peeking</a> (looking before a predetermined amount of time) and (2) heavily skewed data.</p><p id="5a86" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">When monitoring whether or not a metric has changed in real time, users want to be notified as soon as the model has the confidence that this is true. However, doing so naively results in the first issue, peeking. In traditional A/B testing, the statistical test is only applied once after a predetermined time, because there is a chance that a significant result is due to randomness and not an actual effect. For real-time ERF, a single test is not enough: depending on how long we wait to run it, we risk either taking too long to detect some errors, or missing other errors that take longer to surface. Instead, we want to check (peek at) the model every 5 minutes so that we can react quickly. With a significance threshold of 0.05, running 100 A/A comparisons would be expected to yield ~5 significant results that are actually false positives. 
The same issue arises when computing p-values on the same data set multiple times. Each evaluation carries a 5% chance of a false positive, so over many evaluations the chance of at least one false positive approaches 100%.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go no"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/0*GHdgh0pCLSG87czD" width="700" height="429" role="presentation" /></div></div><figcaption class="nl bm gp gn go nm nn bn b bo bp co">Figure 2: Increasing evaluations inevitably lead to false positives</figcaption></figure><p id="2cef" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">To balance early detection against noisiness, we use <a class="au ll" href="https://en.wikipedia.org/wiki/Sequential_analysis" rel="noopener ugc nofollow" target="_blank">sequential analysis</a>. Sequential methods do not assume a fixed sample size (i.e., checking the model once) and allow us to continually monitor a metric without worrying about false positives incurred due to peeking. One way to correct for false positives (<a class="au ll" href="https://en.wikipedia.org/wiki/Type_I_and_type_II_errors#Type_I_error" rel="noopener ugc nofollow" target="_blank">Type I errors</a>) is by applying a <a class="au ll" href="https://en.wikipedia.org/wiki/Bonferroni_correction" rel="noopener ugc nofollow" target="_blank">Bonferroni correction</a>. If you check your model for statistical significance four times and want to guarantee a 5% overall false positive rate, you need to divide your significance threshold by four, meaning only results with a p-value at or under 1.25% count as significant. However, doing so is too conservative because the checks are not independent: each check uses the same base of data, only adding observations as time goes on. 
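</p><p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">To make the arithmetic above concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the repeated checks are independent, which real peeks at accumulating data are not. The function names are hypothetical and are not part of ERF.</p>

```python
def chance_of_any_false_positive(alpha: float, num_checks: int) -> float:
    """Probability of at least one false positive across independent checks."""
    return 1.0 - (1.0 - alpha) ** num_checks


def bonferroni_threshold(alpha: float, num_checks: int) -> float:
    """Per-check threshold that keeps the overall false positive rate at or under alpha."""
    return alpha / num_checks


if __name__ == "__main__":
    # Illustrative numbers only: a 0.05 threshold and a growing number of peeks.
    for checks in (1, 4, 12, 100):
        rate = chance_of_any_false_positive(0.05, checks)
        print(f"{checks:>3} checks: {rate:.1%} chance of a false positive")
    print(f"Bonferroni threshold for 4 checks: {bonferroni_threshold(0.05, 4):.4f}")
```

<p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Under the independence assumption, four checks at 0.05 already give roughly an 18.5% chance of at least one false positive, while the Bonferroni threshold for four checks is 0.0125, matching the 1.25% figure above.</p><p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">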
Sequential models take this dependence into account while guaranteeing false positive rates more efficiently than Bonferroni. We use two different sequential models: SSRM (Sequential Sample Ratio Mismatch) for count metrics and <a class="au ll" href="https://arxiv.org/abs/1906.09712" rel="noopener ugc nofollow" target="_blank">Sequential Quantiles</a> (Howard, Ramdas) for quantile metrics.</p><p id="177d" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">The second issue that we needed to solve in order to be robust is handling skewed data. Performance metrics like latency can have extremely heavy tails. Models that assume a normal distribution won’t be effective because the Central Limit Theorem does not take effect. By applying Sequential Quantiles, we can avoid assumptions about the metric’s distribution and directly measure the difference between arbitrary quantiles.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go np"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/0*_UUUc7wmcL3jUKei" width="700" height="438" role="presentation" /></div></div><figcaption class="nl bm gp gn go nm nn bn b bo bp co">Figure 3: Metrics may have non-normal distributions</figcaption></figure><p id="7f42" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Lastly, many important measures are not independent. Metrics like latency and impressions have within-user correlation, so each event in the data cannot be treated as an independent unit. 
To account for this correlation, we first aggregate all measures into user-level metrics before evaluating statistical models.</p><h1 id="4a39" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Conclusion</h1><p id="04d3" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">With the statistical methods in place to evaluate business metrics in near real-time, we could now detect problems that were invisible to Spinnaker or that required too much lead time to catch with traditional ERF experiments.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go nk"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/0*nmC6mnUBfe98hcxo" width="700" height="301" role="presentation" /></div></div><figcaption class="nl bm gp gn go nm nn bn b bo bp co">Figure 4: How Real-time ERF fits between Spinnaker and Traditional ERF</figcaption></figure><p id="24f6" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Operationalizing the newly created near real-time metrics and statistical methods required further engineering and, more challenging still, a change in the experimentation culture at Airbnb. 
In the following post we will detail how our near real-time metrics pipeline was built, how these metrics powered automated decision making, and how we drove adoption across the company.</p><h1 id="b080" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Appreciations</h1><p id="f56d" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">Thanks to <a class="au ll" href="https://www.linkedin.com/in/adriankuhn/" rel="noopener ugc nofollow" target="_blank">Adrian Kuhn</a>, <a class="au ll" href="https://www.linkedin.com/in/alex-shaojie-deng-b572347/" rel="noopener ugc nofollow" target="_blank">Alex Deng</a>, <a class="au ll" href="https://www.linkedin.com/in/antoinecreux/" rel="noopener ugc nofollow" target="_blank">Antoine Creux</a>, <a class="au ll" href="https://www.linkedin.com/in/erikriverson/" rel="noopener ugc nofollow" target="_blank">Erik Iverson</a>, <a class="au ll" href="https://www.linkedin.com/in/george-l-9b946655/" rel="noopener ugc nofollow" target="_blank">George Li</a>, <a class="au ll" href="https://www.linkedin.com/in/krishna-bhupatiraju-1ba1a524/" rel="noopener ugc nofollow" target="_blank">Krishna Bhupatiraju</a>, <a class="au ll" href="https://www.linkedin.com/in/preetiramasamy/" rel="noopener ugc nofollow" target="_blank">Preeti Ramasamy</a>, <a class="au ll" href="https://www.linkedin.com/in/rstata/" rel="noopener ugc nofollow" target="_blank">Raymie Stata</a>, Reid Andersen, <a class="au ll" href="https://www.linkedin.com/in/ronnyk/" rel="noopener ugc nofollow" target="_blank">Ronny Kohavi</a>, <a class="au ll" href="https://www.linkedin.com/in/shao-xie-0b84b64/" rel="noopener ugc nofollow" target="_blank">Shao Xie</a>, <a class="au ll" href="https://www.linkedin.com/in/tatiana-xifara/" rel="noopener ugc nofollow" target="_blank">Tatiana Xifara</a>, <a class="au ll" href="https://www.linkedin.com/in/vincent-chan-70080423/" rel="noopener ugc 
nofollow" target="_blank">Vincent Chan</a>, <a class="au ll" href="https://www.linkedin.com/in/xin-tu/" rel="noopener ugc nofollow" target="_blank">Xin Tu</a> and the OMNI team.</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/how-airbnb-safeguards-changes-in-production-9fc9024f3446</link>
      <guid>https://medium.com/airbnb-engineering/how-airbnb-safeguards-changes-in-production-9fc9024f3446</guid>
      <pubDate>Mon, 11 Jul 2022 19:02:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[T-LEAF: Taxonomy Learning and EvaluAtion Framework]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@cenzhao06?source=post_page-----30ae19ce8c52--------------------------------"><div class="l dq"><img alt="Cen(Mia) Zhao" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/0*zVFuKKuy_ON_RMwr." width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@cenzhao06?source=post_page-----30ae19ce8c52--------------------------------">Cen(Mia) Zhao</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">Jun 23</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">10 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div><div class="cl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><p id="57fb" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km 
kn ko kp kq kr ks kt ku it gc"><strong class="jz jb">How we applied qualitative learning, human labeling and machine learning to iteratively develop Airbnb’s Community Support Taxonomy.</strong></p><p id="c7e9" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><strong class="jz jb">By:</strong> <a class="au kv" rel="noopener" href="https://medium.com/@cenzhao06">Mia Zhao,</a> <a class="au kv" href="https://www.linkedin.com/in/peggyshao" rel="noopener ugc nofollow" target="_blank">Peggy Shao</a>, <a class="au kv" href="https://www.linkedin.com/in/maggiekhanson/" rel="noopener ugc nofollow" target="_blank">Maggie Hanson</a>, <a class="au kv" rel="noopener" href="https://medium.com/@wangpengcqb">Peng Wang</a>, <a class="au kv" href="https://www.linkedin.com/in/bo-zeng-71915624" rel="noopener ugc nofollow" target="_blank">Bo Zeng</a></p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*X863W9ZDWmr7SlKW" width="700" height="467" role="presentation" /></div></div></figure><h1 id="f063" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Background</h1><p id="add1" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Taxonomies are knowledge organization systems used to classify and organize information. Taxonomies use words to describe things — as opposed to numbers or symbols — and hierarchies to group things into categories. The structure of a taxonomy expresses how those things relate to each other. For instance, a <em class="ml">Superhost</em> is a type of <em class="ml">Host</em> and a <em class="ml">Host</em> is a type of Airbnb <em class="ml">User</em>. 
Taxonomies provide vital terminology control and enable downstream systems to navigate information and analyze consistent, structured data.</p><p id="66a4" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Airbnb uses taxonomies in front-end products to help guests and hosts discover exciting stays or experiences, as well as inspirational content and customer support offerings. Airbnb also uses taxonomies in backstage tooling to structure data, organize internal information, and support machine learning applications.</p><p id="4db1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Classifying the types of issues Airbnb community members face is vital for several reasons:</p><ul class=""><li id="da39" class="mm mn ja jz b ka kb ke kf ki mo km mp kq mq ku mr ms mt mu gc"><strong class="jz jb">Hosts and guests</strong> need to be able to describe issues to Airbnb in order to receive relevant help suggestions or get connected with the best support.</li><li id="0092" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc"><strong class="jz jb">Support Ambassadors</strong> (Airbnb’s Community Support specialists) need quick and easy access to workflows that help them resolve issues for guests and Hosts.</li><li id="5459" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc"><strong class="jz jb">Airbnb business units</strong> need to understand where and why guests and Hosts encounter problems so that we can improve our product and make the Airbnb experience better.</li></ul><p id="5ad7" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The Contact Reasons taxonomy is a new, consolidated issue taxonomy that supports all of these use cases. 
Before Contact Reasons, Community Support had siloed taxonomies for guests and Hosts, Support Ambassadors, and machine learning models that each used different words and structures to classify the same issues and relied on manual mapping efforts to stay in sync.</p><p id="d899" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The consolidation of disjointed issue taxonomies into Contact Reasons was the first project of its kind at Airbnb. Developing such a new taxonomy requires iterative learning: taxonomists create or revise the taxonomy; the taxonomy is rolled out to ML model training, products, and services; and its quality is evaluated to identify areas for improvement. Before this work, there was no systematic process in place to evaluate taxonomy development or performance, and iteration was mostly subjective and qualitative. To accelerate iterative development with a more quantitative and objective evaluation of taxonomy quality, we created T-LEAF, a <strong class="jz jb"><em class="ml">T</em></strong><em class="ml">axonomy </em><strong class="jz jb"><em class="ml">L</em></strong><em class="ml">earning and </em><strong class="jz jb"><em class="ml">E</em></strong><em class="ml">valu</em><strong class="jz jb"><em class="ml">A</em></strong><em class="ml">tion </em><strong class="jz jb"><em class="ml">F</em></strong><em class="ml">ramework</em>, which quantitatively evaluates a taxonomy from three perspectives: coverage, usefulness, and agreement.</p><h1 id="1b05" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Challenges in Evaluating the New Taxonomy</h1><p id="be0b" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">In the Airbnb Community Support domain, new taxonomies or taxonomy nodes often need to be created before we have either real-world data or clear downstream workflow 
applications. Without a consistent quantitative evaluation framework to generate input metrics, it’s difficult to gauge the quality of a new taxonomy (or a taxonomy version) when directly applying it to downstream applications.</p><h1 id="7ade" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Lack of quantitative evaluation framework</h1><p id="b95c" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Taxonomies are typically developed by qualitative-centric approaches¹. When we started prototyping the new taxonomy, we evaluated feedback from existing users, and recruited guests and Hosts for several rounds of user research to generate insights. While qualitative evaluation like domain expert review is helpful in identifying high-level challenges and opportunities, it is insufficient for providing evaluation at scale, due to small sample sizes and potential sample bias from users participating in the research.</p><h1 id="988d" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Lengthy and iterative product cycle for taxonomy launches</h1><p id="f520" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Developing and launching a taxonomy can be a lengthy and iterative process that requires several quarters of use to get substantive and reliable quantitative feedback. 
A typical process includes:</p><ul class=""><li id="681b" class="mm mn ja jz b ka kb ke kf ki mo km mp kq mq ku mr ms mt mu gc"><strong class="jz jb">Taxonomy discovery and development</strong> based on product requirement or data-driven analysis</li><li id="abb5" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc"><strong class="jz jb">Production changes</strong> to integrate backend environments and frontend surfaces, including necessary design and content updates</li><li id="3612" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc"><strong class="jz jb">ML model</strong> (re)label training data, retraining, and deployment</li><li id="f59a" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc"><strong class="jz jb">Logging and data analysis on user feedback</strong></li></ul><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go na"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*2xY0JS_w7ZTjcm4Q" width="700" height="112" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Figure 1. Typical taxonomy development iteration cycle.</figcaption></figure><p id="c2bd" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Before T-LEAF, the taxonomy development process relied solely on output metrics to measure the effectiveness of a new taxonomy, which means that: 1) major changes take a long time to experiment and test; and 2) minor changes like adding or updating new nodes aren’t tested. 
T-LEAF addresses both pain points through consistent, periodic scoring.</p><p id="1fa0" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">T-LEAF was developed to bring more quantitative evaluation into taxonomy development and thereby accelerate the development iteration.</p><h1 id="12d8" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Taxonomy Learning and EvaluAtion Framework (T-LEAF)</h1><h1 id="3f34" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Quality of a Taxonomy</h1><p id="614b" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">The T-LEAF framework measures the quality of a taxonomy in three aspects: 1) coverage, 2) usefulness, and 3) agreement.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go ne"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*Sab-WsEGFK1FXkDTM4AbEA.png" width="700" height="363" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Figure 2. T-LEAF Structure</figcaption></figure><h1 id="11d6" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Coverage</h1><p id="5ad9" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Coverage indicates how well a taxonomy can classify the scope of real-world objects. In Contact Reasons, the coverage score evaluates how well the taxonomy captures the reasons guests and Hosts contact Airbnb’s Community Support team. 
When ‘coverage’ is low, a lot of user issues (data objects) will not be covered by the taxonomy and become ‘Other’ or ‘Unknown’.</p><blockquote class="nf ng nh"><p id="f95f" class="jx jy ml jz b ka kb kc kd ke kf kg kh ni kj kk kl nj kn ko kp nk kr ks kt ku it gc">Coverage Score = 1 - percentage of data classified as “other” or “undefined.”</p></blockquote><h1 id="f4fb" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Usefulness</h1><p id="c36a" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Usefulness shows how evenly objects distribute across the structure of the taxonomy into meaningful categories. If a taxonomy is too coarse, i.e., has too few nodes or categories, the limited number of options may not adequately distinguish between the objects that are being described. On the other hand, if a taxonomy is too granular, it may fail to explain similarities between objects.</p><p id="73a4" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In T-LEAF, for a benchmark dataset with n examples (e.g., distinct user issues), we hypothesize that a taxonomy with sqrt(n) number of nodes² gives a good balance between ‘too coarse’ and ‘too granular’. For any input <em class="ml">x</em>, we compute a split score from (0,1] to evaluate the ‘usefulness’:</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nl"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*z45U60-CPSN1R5jdN76BIQ.png" width="700" height="144" role="presentation" /></div></div></figure><p id="3ae3" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">We want to evaluate the data deviation by assuming the normal distribution. 
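</p><p class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The coverage score and the sqrt(n) heuristic described above can be sketched in a few lines of Python. This is only an illustration: the catch-all label names and the function names are assumptions, and the split-score formula shown in the figure above is not reproduced here.</p>

```python
import math
from typing import Iterable

# Assumed catch-all labels; the real taxonomy may use different names.
CATCH_ALL_LABELS = frozenset({"other", "undefined", "unknown"})


def coverage_score(labels: Iterable[str]) -> float:
    """Return 1 minus the fraction of items that fell into a catch-all label."""
    items = [label.strip().lower() for label in labels]
    if not items:
        raise ValueError("coverage is undefined for an empty dataset")
    uncovered = sum(1 for label in items if label in CATCH_ALL_LABELS)
    return 1.0 - uncovered / len(items)


def target_node_count(num_examples: int) -> int:
    """Node count suggested by the sqrt(n) heuristic, rounded to the nearest integer."""
    return max(1, round(math.sqrt(num_examples)))
```

<p class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">With these definitions, a hypothetical sample in which two of four issues land in a catch-all label has a coverage score of 0.5, and a benchmark of 100 examples suggests a target of 10 nodes.</p><p class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">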
For example, with 100 distinct user issues, if we split into 1 (‘too coarse’) or 100 categories (‘too granular’), the usefulness score would be close to 0; if we split into 10 categories, the usefulness score would be 1.</p><h1 id="48ce" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Agreement</h1><p id="2469" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Agreement captures the inter-rater reliability given the taxonomy. We propose two ways to evaluate agreement.</p><h2 id="dfc5" class="nm lj ja bn lk nn no np lo nq nr ns ls ki nt nu lw km nv nw ma kq nx ny me nz gc">Human Label Inter-rater Agreement</h2><p id="25c9" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Multiple human annotators label the same data according to the taxonomy definition, and we calculate inter-rater reliability using Cohen’s Kappa, which lies in the range [-1, 1]:</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go oa"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*GacnudDKA1g-QVKLeGsiKw.png" width="700" height="89" role="presentation" /></div></div></figure><h2 id="e97e" class="nm lj ja bn lk nn no np lo nq nr ns ls ki nt nu lw km nv nw ma kq nx ny me nz gc">ML Model Training Accuracy</h2><p id="98fe" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Having multiple human raters annotate one data set can be expensive. In reality, most data is annotated by just one human. In Airbnb’s Community Support, each customer issue/ticket is processed by one agent, who labels the ticket’s issue type based on the taxonomy. 
We train an ML model on this single-rater labeled training data and then apply the model to the same training data to measure the training accuracy. If the taxonomy is well defined (i.e., with high ‘agreement’), then similar issues (data points) should have similar labels even though these labels come from different agents. An ML model trained on a highly consistent training dataset should have high training accuracy.</p><p id="63cc" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">We ran experiments comparing the multi-rater agreement approach with ML training accuracy over single-rater training data.</p><p id="a037" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Results are shown in Table 1. We observed that for both methods: 1) accuracies were similar for the top two levels of the taxonomy (L1 and L2 issues are defined in the next section); and 2) there were similar areas of confusion in both approaches. If taxonomy nodes are clear enough for humans to perform tagging, the consistency rate increases and the model can better capture human intent. The opposite is also true: model training accuracy is negatively impacted if end users are confused by options or unable to choose proper categories.</p><p id="c656" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">It took one analyst and nine annotators about a month to create the multi-rater dataset. In contrast, it took one ML engineer a day to train an ML model over the single-rater data and calculate the training accuracy. 
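</p><p class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">For reference, the multi-rater side of this comparison relies on Cohen’s Kappa. Below is a minimal sketch for two raters, using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name is illustrative and this is not Airbnb’s implementation.</p>

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label everywhere; kappa is 0/0 here,
        # and this sketch chooses to report full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

<p class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Identical label sequences give a kappa of 1, while agreement no better than chance gives a value near 0.</p><p class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">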
As shown in Table 1, ML Training accuracy provides a similar evaluation of taxonomy’s ‘agreement’ quality.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go ob"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*YgpIzpXbhVKc9-GwRrljSQ.png" width="700" height="329" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Table 1. Comparison between multi-rater labeling approach and ML-model over single-rater training data.</figcaption></figure><h1 id="19d0" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Developing the Contact Reason Taxonomy using T-LEAF</h1><p id="8dd6" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">The Contact Reasons taxonomy consists of nearly 200 nodes, spread across a hierarchy that goes from broad categories in Level 1 (L1) to narrower categories in Level 2 (L2) to specific issues in Level 3 (L3). For example:</p><ul class=""><li id="60d5" class="mm mn ja jz b ka kb ke kf ki mo km mp kq mq ku mr ms mt mu gc">Problems with your reservation (L1)</li><li id="df50" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc">Cleanliness and health concerns (L2)</li><li id="34b8" class="mm mn ja jz b ka mv ke mw ki mx km my kq mz ku mr ms mt mu gc">Smoke or other odors in listing (L3)</li></ul><p id="d479" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">While the old taxonomy had unpredictable levels of granularity, depending on the section, Contact Reasons has a consistent, three-level structure that better supports our continuous evaluation framework. 
We used T-LEAF in the transition from the old taxonomy to the new taxonomy (Contact Reasons) to enable a faster feedback loop and provide quantified quality control before launching the new taxonomy into production environments (Figure 3).</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*jT-HinGVLNec_9sG" width="700" height="145" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Figure 3. Iterative process of taxonomy development, evaluation, and deployment with T-LEAF.</figcaption></figure><p id="ede3" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">First, we sent a real-world dataset to Airbnb Community Support Labs (CS Labs), a group of skilled and tenured Support Ambassadors, for human annotation. Then, we used T-LEAF scores as an input to the taxonomy development process. Using that input, the Core Machine Learning (CoreML) Engineering team and the Taxonomy team collaborated to significantly improve T-LEAF scores before running experiments in production.</p><p id="1418" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">To evaluate the Contact Reasons taxonomy in one of these production environments, we reviewed its performance in Airbnb bot³. Airbnb bot is one of Community Support’s core products that helps guests and Hosts self-solve issues and connect to Support Ambassadors when necessary. 
We found that the improvements to the Contact Reasons taxonomy, as measured by T-LEAF’s metrics of coverage, usefulness, and agreement, also translated to actual improvements in issue coverage, self-solve effectiveness, and issue prediction accuracy.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go oc"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*JbLYJHtVd1E3AWCCtMPDnQ.png" width="700" height="246" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Table 2. T-LEAF scores for the old and new taxonomies</figcaption></figure><h1 id="c175" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">A higher T-LEAF coverage score leads to greater issue coverage in production</h1><p id="acd7" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">After launching the Contact Reasons taxonomy, we examined four months of production data and found that 1.45% of issues were labeled “It’s something else,” which is 5.8% lower than with the old taxonomy. 
This is consistent with the improvement in the T-LEAF coverage score (5.3% more coverage than the previous version).</p><h1 id="5bf8" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">A higher usefulness score leads to more issues being resolved through self-service</h1><p id="8c0e" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">For example, in the new taxonomy, there are two new nodes called “<em class="ml">Cancellations and refunds &gt; Canceling a reservation you booked &gt; Helping a Host with a cancellation</em>” and “<em class="ml">Cancellations and refunds &gt; Canceling a reservation you’re hosting &gt; Helping a guest with a cancellation.</em>” The old taxonomy only had nodes for “<em class="ml">Reservations &gt; Cancellations &gt; Host-initiated</em>” and “<em class="ml">Reservations &gt; Cancellations &gt; Guest-initiated</em>”, which did not have the granularity to determine when the guest or Host seeking support is not the one requesting the cancellation.</p><p id="9535" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">With the new nodes, we developed a machine learning model that drives traffic to tailored cancellation workflows⁴. This ensures that guests receive the appropriate refund and Host cancellation penalties are applied only when relevant, all without needing to contact Airbnb Support Ambassadors.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go od"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*h9c4FN7fDUPhXet6R-B4ow.png" width="700" height="221" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Figure 4. 
Airbnb Chatbot self-solve solutions</figcaption></figure><h1 id="2a97" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">A higher T-LEAF agreement score results in more accurate issue prediction</h1><p id="25cd" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Compared to issue prediction models built on the old taxonomy, the model built on the new taxonomy has improved accuracy by<strong class="jz jb"> 9%.</strong> This means that the category the ML model predicts for an issue is more likely to match the category selected by the Support Ambassador.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go oe"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*2FMAyCXyP75We2ku" width="700" height="423" role="presentation" /></div></div><figcaption class="nb bm gp gn go nc nd bn b bo bp co">Figure 5. User/Agent and ML Model Agreement</figcaption></figure><h1 id="98cc" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Conclusion</h1><p id="eb38" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">A quantitative framework to evaluate taxonomy supports faster iterations and reduces the risk of launching major taxonomy transformations, which has positive impacts for all of our audiences: guests, Hosts, Support Ambassadors, and Airbnb businesses. 
The T-LEAF framework, which scores the quality of a taxonomy on coverage, usefulness, and agreement, has now been applied to a production taxonomy in Community Support, and results show that using this methodology for quantitative taxonomy evaluation can lead to better model performance and greater issue coverage.</p><p id="4dbe" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Developing, piloting, and establishing T-LEAF as part of our continuous improvement framework for taxonomy evolution has been a collaborative effort across teams. The CoreML team partnered closely with Taxonomy, Product, and CS Labs to create this new model for iterative development of issue categorization and prediction. Having piloted this new way of working on Contact Reasons, we’re confident we’ll see more positive results as we continue to apply the T-LEAF methodology to future taxonomy initiatives.</p><p id="a03a" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">[1]: Szopinski, D., Schoormann, T., &amp; Kundisch, D. (2019). Because Your Taxonomy is Worth IT: towards a Framework for Taxonomy Evaluation. <em class="ml">ECIS</em>. <a class="au kv" href="https://aisel.aisnet.org/ecis2019_rp/104/" rel="noopener ugc nofollow" target="_blank">https://aisel.aisnet.org/ecis2019_rp/104/</a></p><p id="e060" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">[2]: Carlis, J., &amp; Bruso, K. (2012). RSQRT: An Heuristic for Estimating the Number of Clusters to Report. <em class="ml">Electronic Commerce Research and Applications</em>, 11(2), 152–158. 
<a class="au kv" href="https://doi.org/10.1016/j.elerap.2011.12.006" rel="noopener ugc nofollow" target="_blank">https://doi.org/10.1016/j.elerap.2011.12.006</a></p><p id="815d" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">[3]: Intelligent Automation Platform: Empowering Conversational AI and Beyond at Airbnb. <a class="au kv" rel="noopener" href="https://medium.com/airbnb-engineering/intelligent-automation-platform-empowering-conversational-ai-and-beyond-at-airbnb-869c44833ff2">https://medium.com/airbnb-engineering/intelligent-automation-platform-empowering-conversational-ai-and-beyond-at-airbnb-869c44833ff2</a></p><p id="d067" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">[4]: Task-Oriented Conversational AI in Airbnb Customer Support. <a class="au kv" rel="noopener" href="https://medium.com/airbnb-engineering/task-oriented-conversational-ai-in-airbnb-customer-support-5ebf49169eaa">https://medium.com/airbnb-engineering/task-oriented-conversational-ai-in-airbnb-customer-support-5ebf49169eaa</a></p><h1 id="53b0" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Acknowledgments</h1><p id="761e" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Thanks to CS Labs for labeling support on existing and new taxonomies!</p><p id="fa02" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Thanks to Pratik Shah, Rachel Lang, Dexter Dilla, Shuo Zhang, Zhiheng Xu, Alex Zhou, Wayne Zhang, Zhenyu Zhao, Jerry Hong, Gavin Li, Kristen Jaber, Aliza Hochsztein, Naixin Zhang, Gina Groom, Robin Foyle, Parag Hardas, Zhiying Gu, Kevin Jungmeisteris, Jonathan Li-On Wing, Danielle Martin, Bill Selman, Hwanghah Jeong, Stanley Wong, Lindsey Oben, Chris Enzaldo, Jijo George, Ravish Gadhwal, and 
Ben Ma for supporting our successful CS taxonomy launch and workflow-related applications!</p><p id="8166" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Thanks to Joy Zhang, Andy Yasutake, Jerry Hong, Lianghao Li, Susan Stevens, Evelyn Shen, Axelle Vivien, Lauren Mackevich, and Cynthia Garda for reviewing, editing, and making great suggestions for the blog post!</p><p id="c6e0" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Last but not least, we appreciate Joy Zhang, Andy Yasutake, Raj Rajagopal, Tina Su, and Cynthia Garda for their leadership support!</p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/t-leaf-taxonomy-learning-and-evaluation-framework-30ae19ce8c52</link>
      <guid>https://medium.com/airbnb-engineering/t-leaf-taxonomy-learning-and-evaluation-framework-30ae19ce8c52</guid>
      <pubDate>Thu, 23 Jun 2022 19:20:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb’s Trip to Linaria]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@lencioni?source=post_page-----dc169230bd12--------------------------------"><div class="l dq"><img alt="Joe Lencioni" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/1*r4bT1s_VG5WFqtCX5M-2lA.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@lencioni?source=post_page-----dc169230bd12--------------------------------">Joe Lencioni</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">Jun 16</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">11 min read</div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><div class=""><h2 id="0d0b" class="pw-subtitle-paragraph jx iz ja bn b jy jz ka kb kc kd 
ke kf kg kh ki kj kk kl km kn ko co">Learn how Linaria, Airbnb’s newest choice for web styling, improved both developer experience and web performance</h2></div><figure class="kq kr ks kt gz ku gn go paragraph-image"><div role="button" tabindex="0" class="kv kw dq kx cf ky"><div class="gn go kp"><img alt="" class="cf kz la" src="https://miro.medium.com/max/1400/1*-qT4pQIPIsxHBZj22sQtag.jpeg" width="700" height="468" role="presentation" /></div></div></figure><p id="655f" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">CSS is a critical component of every web application, and many solutions have evolved for how styles are written by developers and delivered to visitors. In this post we’ll take you through Airbnb’s journey from Sass to CSS-in-JS and show you why we landed on <a class="au lx" href="https://github.com/callstack/linaria" rel="noopener ugc nofollow" target="_blank">Linaria, a zero-runtime CSS-in-JS library</a>, and the impact it has had on the developer experience and performance of Airbnb’s web app.</p><h1 id="77ce" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">From Sass to CSS-in-JS</h1><p id="e1c6" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">In 2016, our web frontend was in a monolithic <a class="au lx" href="https://rubyonrails.org/" rel="noopener ugc nofollow" target="_blank">Ruby on Rails</a> app using a combination of <a class="au lx" href="https://github.com/rails/sprockets" rel="noopener ugc nofollow" target="_blank">Sprockets</a>, <a class="au lx" href="https://browserify.org/" rel="noopener ugc nofollow" target="_blank">Browserify</a>, and <a class="au lx" href="https://sass-lang.com/" rel="noopener ugc nofollow" target="_blank">Sass</a>. 
We had a <a class="au lx" href="https://getbootstrap.com/" rel="noopener ugc nofollow" target="_blank">Bootstrap</a>-inspired internal toolkit for styling, but we weren’t using anything like <a class="au lx" href="https://github.com/css-modules/css-modules" rel="noopener ugc nofollow" target="_blank">CSS Modules</a> or <a class="au lx" href="http://getbem.com/" rel="noopener ugc nofollow" target="_blank">BEM</a>.</p><p id="aafb" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Production bugs were often caused by our styling — sometimes the correct stylesheet was missing from some pages and other times styles from different stylesheets conflicted unexpectedly.</p><figure class="kq kr ks kt gz ku"><div class="m l dq"></div></figure><p id="d300" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Additionally, developers <a class="au lx" href="https://css-tricks.com/how-do-you-remove-unused-css-from-a-site/" rel="noopener ugc nofollow" target="_blank">rarely removed styles once added since it was hard to know whether they were still needed</a>. These issues compounded as our product surface area rapidly expanded.</p><p id="bbcf" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">As we began to build our <a class="au lx" href="https://www.youtube.com/watch?v=fHQ1WSx41CA" rel="noopener ugc nofollow" target="_blank">Design System</a> in React, we landed on CSS-in-JS as an exciting new option. At the time, CSS-in-JS was still in its infancy–only a few libraries existed and <a class="au lx" href="https://styled-components.com/" rel="noopener ugc nofollow" target="_blank">Styled Components</a> had not been invented yet. 
We chose <a class="au lx" href="https://github.com/khan/aphrodite" rel="noopener ugc nofollow" target="_blank">Aphrodite</a>, but didn’t want to be directly coupled to Aphrodite’s implementation for two reasons: since CSS-in-JS was a nascent space we wanted to have the flexibility to switch implementations at a later date, and we also wanted something that would work for open source projects where people might not want Aphrodite. So we created an abstraction layer called <a class="au lx" href="https://github.com/airbnb/react-with-styles" rel="noopener ugc nofollow" target="_blank">react-with-styles</a>, which gave us a <a class="au lx" href="https://reactjs.org/docs/higher-order-components.html" rel="noopener ugc nofollow" target="_blank">higher-order component (HOC)</a> to define themeable styles.</p><figure class="kq kr ks kt gz ku"><div class="m l dq"></div></figure><p id="329e" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">This allowed components to be styled in the same file, making repo organization more convenient. More importantly, <strong class="ld jb">moving from a globally-aware styling system to a component-based styling system gave us guarantees around how styles would be applied and what files were needed to render every component correctly on every page</strong>. This enabled us to rely on <a class="au lx" href="https://happo.io/" rel="noopener ugc nofollow" target="_blank">Happo, our screenshot testing tool of choice</a>, and as a result visual regressions plummeted (disclosure: I am the co-creator of Happo).</p><p id="2c14" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Though react-with-styles has served us well for years, it comes with <strong class="ld jb">performance</strong> and <strong class="ld jb">developer experience</strong> tradeoffs. 
The styles and runtime libraries increase critical path JS bundle size, and applying styles at render time comes with a CPU cost (10–20% of our component mount times). While we get the aforementioned guarantees about styles, actually writing styles in JavaScript objects feels awkward compared to regular CSS syntax. These tradeoffs led us to reconsider how we style the web at Airbnb.</p><h1 id="30b1" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Considering Our Options</h1><p id="7b23" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">To address the problems with react-with-styles, we formed a working group of engineers from various teams. We considered a number of directions, which fit into the following high-level categories:</p><ul class=""><li id="adfa" class="my mz ja ld b le lf lh li lk na lo nb ls nc lw nd ne nf ng gc">Statically extract CSS from react-with-styles at build time</li><li id="761f" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Write our own framework</li><li id="f326" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Investigate and adopt an existing framework</li></ul><p id="decd" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">We decided against <strong class="ld jb">static extraction</strong> from react-with-styles at build time because it would require a lot of effort. Additionally, it would be home-grown and would therefore lack the benefits of a community. Finally, it would not address the developer ergonomics issues.</p><p id="3a5c" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Similarly, <strong class="ld jb">writing our own framework</strong> would have had a high cost of initial implementation, maintenance, and support. 
Additionally, there were existing solutions for this problem that we wanted to leverage and contribute back to.</p><figure class="kq kr ks kt gz ku gn go paragraph-image"><div class="gn go nm"><img alt="An xkcd comic titled “How Standards Proliferate” (see: A/C chargers, character encodings, instant messaging, etc.). Panel 1: Situation: There are 14 competing standards. Panel 2: “14?! Ridiculous! We need to develop one universal standard that covers everyone’s use cases.” “Yeah!” Panel 3: Soon: Situation: There are 15 competing standards." class="cf kz la" src="https://miro.medium.com/max/1000/0*7Kwk-MuLZOIUhgnv" width="500" height="283" /></div><figcaption class="nn bm gp gn go no np bn b bo bp co">Comic from <a class="au lx" href="https://xkcd.com/927/" rel="noopener ugc nofollow" target="_blank">https://xkcd.com/927/</a> by Randall Munroe, used under a CC BY-NC 2.5 license.</figcaption></figure><p id="6666" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">After evaluating several <strong class="ld jb">existing frameworks</strong> against our requirements, we narrowed the candidates down to three for building a proof of concept:</p><ul class=""><li id="0a23" class="my mz ja ld b le lf lh li lk na lo nb ls nc lw nd ne nf ng gc"><a class="au lx" href="https://emotion.sh/docs/introduction" rel="noopener ugc nofollow" target="_blank">Emotion</a>: CSS-in-JS, with a low runtime cost</li><li id="6872" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc"><a class="au lx" href="https://github.com/callstack/linaria" rel="noopener ugc nofollow" target="_blank">Linaria</a>: zero-runtime CSS-in-JS (static CSS extraction)</li><li id="bb46" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc"><a class="au lx" href="https://github.com/seek-oss/treat" rel="noopener ugc nofollow" target="_blank">Treat</a>: near-zero-runtime CSS-in-JS (static CSS extraction)</li></ul><p id="da82" 
class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">The proof-of-concept work was done in a new repo that implemented a server-rendered, client-hydrated, unstyled version of Airbnb’s logged-in homepage. For each framework, this allowed us to:</p><ul class=""><li id="9ee9" class="my mz ja ld b le lf lh li lk na lo nb ls nc lw nd ne nf ng gc">Understand what changes might need to be made to our build system</li><li id="f5b7" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Try out framework APIs and get a feel for developer ergonomics</li><li id="d4fe" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Assess how each framework supports our web styling requirements</li><li id="092a" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Gather performance metrics</li><li id="5935" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Establish a starting point for a migration plan</li></ul><p id="f6cd" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Frameworks were evaluated against each other based on the following ranked list of criteria:</p><ol class=""><li id="7204" class="my mz ja ld b le lf lh li lk na lo nb ls nc lw nq ne nf ng gc"><strong class="ld jb">Performance</strong></li><li id="602e" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nq ne nf ng gc"><strong class="ld jb">Community</strong> (i.e. 
support and adoption)</li><li id="15c7" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nq ne nf ng gc"><strong class="ld jb">Developer experience</strong></li></ol><h1 id="8b61" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Performance Analysis</h1><p id="4e2e" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">Using <a class="au lx" href="https://www.speedcurve.com/" rel="noopener ugc nofollow" target="_blank">SpeedCurve</a>, local benchmarking, and the <a class="au lx" href="https://reactjs.org/docs/profiler.html" rel="noopener ugc nofollow" target="_blank">React &lt;Profiler /&gt;</a>, we ran performance benchmarking tests for each framework. All results were calculated as the median of 200 runs on a throttled MacBook Pro, and are statistically significantly different from control with a p-value of &lt;= 0.05.</p><p id="d82e" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Informed by <a class="au lx" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Airbnb’s Page Performance Score</a> (similar to <a class="au lx" href="https://web.dev/performance-scoring/" rel="noopener ugc nofollow" target="_blank">Lighthouse’s performance score</a>), we focused on the following metrics to give us an idea of how each framework performed and would impact the user experience:</p><ul class=""><li id="5d16" class="my mz ja ld b le lf lh li lk na lo nb ls nc lw nd ne nf ng gc"><a class="au lx" href="https://web.dev/tbt/" rel="noopener ugc nofollow" target="_blank">Total blocking time (TBT)</a></li><li id="2cbf" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Bundle size</li><li id="1d63" class="my mz ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Update layout tree count and duration</li><li id="d74b" class="my mz 
ja ld b le nh lh ni lk nj lo nk ls nl lw nd ne nf ng gc">Composite layers count and duration</li></ul><figure class="kq kr ks kt gz ku"><div class="m l dq"></div></figure><p id="cf54" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">It is clear that the frameworks are divided into two groups: <strong class="ld jb">runtime frameworks</strong> (react-with-styles, Emotion) and <strong class="ld jb">build-time frameworks</strong> (Linaria, Treat).</p><p id="074d" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Benchmarks of the server-rendered and client-hydrated version of our homepage showed Treat and Linaria performing 36% and 22% better than Emotion on <strong class="ld jb">Total Blocking Time</strong>, respectively. All frameworks performed significantly better than react-with-styles, ranging from a 32–56% improvement. <em class="nr">(Note that these numbers should not be used to estimate expected improvements in production, as this is a very specific benchmark designed to test differences between frameworks, not expected savings in production.)</em></p><p id="62ed" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc"><strong class="ld jb">Bundle size</strong> differences also fall into these two categories — with savings on the order of 80 KiB (~12%) for the Linaria/Treat group.</p><p id="15ba" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">The CSS metrics (<strong class="ld jb">update layout tree</strong> and <strong class="ld jb">composite layers</strong>) show that, on average, there is roughly one more layout tree update and layer composition event for react-with-styles/Emotion. 
This is likely due to the insertion and hydration of stylesheets with JavaScript that is not necessary with a CSS extraction library like Linaria or Treat.</p><p id="eafb" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">This performance investigation shows that either Linaria or Treat would be promising options to adopt, and that all frameworks considered are a statistically significant improvement over react-with-styles with Aphrodite.</p><h1 id="41a8" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">What We Liked About Linaria</h1><p id="0c96" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">The above <strong class="ld jb">performance</strong> improvements were largely thanks to Linaria extracting the styles from JS to static CSS files at build time, so there is no JS bundle or runtime CPU overhead — giving it a slight edge over the near-zero runtime Treat. Also, this brings caching benefits since these static CSS files may change at a different cadence than the JS files. Since the styles are extracted at build time, Linaria has the opportunity to automatically remove unused styles — this also opens the door to the possibility of deduplicating styles (i.e. <a class="au lx" href="https://css-tricks.com/lets-define-exactly-atomic-css/" rel="noopener ugc nofollow" target="_blank">Atomic CSS</a>). 
Additionally, Linaria supports injecting critical CSS during server-side rendering, a capability we wanted to preserve from our react-with-styles integration.</p><p id="8fc0" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc"><a class="au lx" href="https://snyk.io/advisor/npm-package/linaria" rel="noopener ugc nofollow" target="_blank">Linaria also seemed to be a healthy project</a> that saw a good amount of activity, <strong class="ld jb">community</strong> involvement, documentation, and adoption. Its good trajectory gave us confidence that it would continue to improve and that we would be able to contribute back.</p><p id="f5f4" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">We found Linaria’s <a class="au lx" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates" rel="noopener ugc nofollow" target="_blank">tagged template literal</a> API, which lets developers write regular CSS syntax, to be an attractive improvement over the JS object HOC API that we built for react-with-styles. 
Additionally, off-the-shelf integrations were available for stylelint, CSS autocompletion, and syntax highlighting, which enriched the <strong class="ld jb">developer experience</strong>.</p><figure class="kq kr ks kt gz ku gn go paragraph-image"><div role="button" tabindex="0" class="kv kw dq kx cf ky"><div class="gn go ns"><img alt="" class="cf kz la" src="https://miro.medium.com/max/1400/0*hyXFKX-bB-ixvHsE" width="700" height="287" role="presentation" /></div></div><figcaption class="nn bm gp gn go no np bn b bo bp co">Off-the-shelf integrations for stylelint, CSS autocompletion, and syntax highlighting working with Linaria in action.</figcaption></figure><p id="0c17" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">We also found value in the similarities between Linaria and our existing solution. The co-location of styles within the component file was a big feature that tipped the scales in favor of Linaria over Treat for us, and the familiar API smoothed the transition for developers and gave us confidence that migration efforts could be eased with automation.</p><h1 id="e110" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Migration Strategy</h1><p id="32b2" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">To roll out this big change, we adopted an incremental migration strategy that is largely automated by <a class="au lx" rel="noopener" href="https://medium.com/airbnb-engineering/turbocharged-javascript-refactoring-with-codemods-b0cae8b326b9">codemods</a> we’ve written. We are leaning heavily on our <a class="au lx" href="https://happo.io/" rel="noopener ugc nofollow" target="_blank">Happo screenshot tests</a> to ensure that our components look the same after they are migrated. 
This allows sections of our codebase to be migrated by running a script and following up with any necessary tweaks, similar to <a class="au lx" rel="noopener" href="https://medium.com/airbnb-engineering/ts-migrate-a-tool-for-migrating-to-typescript-at-scale-cd23bfeb5cc">the approach we took when adopting TypeScript</a>.</p><p id="bb4f" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">The first phase of the migration was handled by the web styling working group and targeted converting a subset of components on a few select pages with varying performance characteristics. This phase was gated on A/B tests which ensured that our initial understanding of the performance held up under the specifics of our app and assured us that there were no hidden problems.</p><p id="d71f" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Once we were confident about the performance and correctness of our Linaria integration, we allowed teams to start using Linaria in new code. We also encouraged teams to migrate their existing code using our codemods. Although the migration has proceeded at a good pace organically, we plan to ensure that all code has moved off of react-with-styles so that we can eventually remove the runtime dependencies from the bundles entirely. This consistency will give us an additional performance boost and reduce the cost of <a class="au lx" href="https://en.wikipedia.org/wiki/Decision_fatigue" rel="noopener ugc nofollow" target="_blank">decision fatigue</a>.</p><h1 id="dd3b" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Contributing Back</h1><p id="21d1" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">Once we started using Linaria, we discovered that automatic style deduplication (i.e. 
Atomic CSS) would give us not just a performance boost, but also would fix some non-performance-related hiccups we ran into.</p><p id="ad78" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">The selectors that Linaria generates are all of the same <a class="au lx" href="https://developer.mozilla.org/en-US/docs/Web/CSS/Specificity" rel="noopener ugc nofollow" target="_blank">specificity</a>. Since CSS selectors of the same specificity depend on their declaration order, the order that the bundler builds these files becomes important. This is problematic when sharing styles between files, since we cannot predict or maintain the order of the styles as the shape of the dependency graph changes.</p><p id="f5c8" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">We initially approached this problem by creating a new <a class="au lx" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates" rel="noopener ugc nofollow" target="_blank">tagged template literal</a> for CSS fragments which allows for the styles to be interpolated into Linaria’s CSS tagged template literals. 
This works okay, but it is unintuitive, defeats <a class="au lx" href="https://github.com/prettier/prettier/blob/d13feed42b6478710bebbcd3225ab6f203a914c1/src/language-js/embed.js#L90-L121" rel="noopener ugc nofollow" target="_blank">tooling that expects styles to be defined in CSS tagged template literals</a>, and leads to the styles being included several times in the CSS bundles (which is suboptimal for performance).</p><p id="8f7f" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Josh Nelson, a member of our web styling working group, <a class="au lx" href="https://github.com/callstack/linaria/pull/867" rel="noopener ugc nofollow" target="_blank">contributed Atomic CSS support back to Linaria</a> and the Linaria community has been very supportive. The change adds a new <a class="au lx" href="https://npmjs.com/@linaria/atomic" rel="noopener ugc nofollow" target="_blank">@linaria/atomic</a> package that when imported instead of <a class="au lx" href="https://www.npmjs.com/package/@linaria/core" rel="noopener ugc nofollow" target="_blank">@linaria/core</a> will generate Atomic CSS at build time. 
This means that if you write your code like this:</p><figure class="kq kr ks kt gz ku"><div class="m l dq"></div></figure><p id="eae7" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Instead of generating output like this (without Atomic CSS):</p><figure class="kq kr ks kt gz ku"><div class="m l dq"></div></figure><p id="2c93" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">The generated output will look something like this (with Atomic CSS):</p><figure class="kq kr ks kt gz ku"><div class="m l dq"></div></figure><p id="c27e" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">The order of appearance problem is solved by build time analysis that chains class names based on the order they are passed in to the cx function to increase specificity when necessary.</p><h1 id="e264" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Reception</h1><p id="b2a3" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">Our engineers have reacted positively to Linaria. Here are some quotes:</p><blockquote class="nt nu nv"><p id="1587" class="lb lc nr ld b le lf kb lg lh li ke lj nw ll lm ln nx lp lq lr ny lt lu lv lw it gc">“Linaria opens up a world where we can code like it’s 1999, in old school pure on CSS. It advises against bad patterns, but gives us the flexibility to build amazing experiences. We’re not fighting the platform anymore, we’re harnessing it and it feels incredibly powerful.” — Callie Riggins</p><p id="2dad" class="lb lc nr ld b le lf kb lg lh li ke lj nw ll lm ln nx lp lq lr ny lt lu lv lw it gc">“Compared to react-with-styles, I care more about what I’m creating now. 
Linaria is so good.” — Ian Demattei-Selby</p><p id="0900" class="lb lc nr ld b le lf kb lg lh li ke lj nw ll lm ln nx lp lq lr ny lt lu lv lw it gc">“I really liked being able to write CSS again. It gives you so much more control over what you can style in the component.” — Brie Bunge</p><p id="7a37" class="lb lc nr ld b le lf kb lg lh li ke lj nw ll lm ln nx lp lq lr ny lt lu lv lw it gc">“It’s great to be writing actual CSS again.” — Victor Lin</p></blockquote><p id="ad61" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Thanks to its familiar CSS syntax, style extraction into static stylesheets, and application of styles using class names, Linaria <strong class="ld jb">increases product development speed</strong> and <strong class="ld jb">unlocks new styling capabilities not possible with react-with-styles and Aphrodite</strong>.</p><h1 id="44a5" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Performance Impact</h1><p id="1067" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">Though we are still at the beginning of our migration, we have run some A/B tests that give us an encouraging look at the real world performance impact of switching to Linaria for a large group of visitors in the wild.</p><p id="456a" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">In one experiment, we converted about 10% of the components rendered on the airbnb.com homepage from react-with-styles to Linaria, and saw Homepage <a class="au lx" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Page Performance Score</a> improve by 0.26%. 
<a class="au lx" href="https://web.dev/fcp/" rel="noopener ugc nofollow" target="_blank">Time to First Contentful Paint (TTFCP)</a> improved by 0.54% (mean of 790ms), while <a class="au lx" href="https://web.dev/tbt/" rel="noopener ugc nofollow" target="_blank">Total Blocking Time (TBT)</a> also had a strong improvement of 1.6% (mean of 1200ms). To put this in perspective, hydrating the homepage with React takes around 200ms for most people, so improvements of this order of magnitude are significant. We believe these performance improvements with Linaria are attributable to no longer generating CSS styles at render-time, which improves render times on both server and client.</p><p id="8fcf" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Assuming the performance improvements will scale linearly (which is a big assumption), converting the remaining 90% of the components <em class="nr">might</em> bring the total improvement to roughly 2.6% for Page Performance Score, 5.4% for Time to First Contentful Paint (TTFCP), and 16% for Total Blocking Time (TBT).</p><p id="971a" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Note that direct comparisons with other industry numbers are a little tricky here, given the different ways we define pages, especially with regard to client routing.</p><h1 id="8fd0" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">What Does This Mean for react-with-styles?</h1><p id="449c" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">Given that many of our components still depend on react-with-styles and that it will take a while for us to complete our migration, <strong class="ld jb">we will put react-with-styles in maintenance mode</strong> until we approach the end of our migration. 
At that point, <strong class="ld jb">we intend to sunset react-with-styles</strong> and the related packages.</p><p id="feba" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">By removing an option from the marketplace we hope to help the community coalesce towards a common solution and invest in better frameworks. If you are looking for a new tool, we think Linaria is a great choice!</p><h1 id="cea4" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Conclusion</h1><p id="98ef" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">Styling infrastructure is still an exciting space, rich with opportunities. At Airbnb, we’ve found big improvements to the <strong class="ld jb">developer experience</strong> by adopting a framework that allows regular CSS syntax to be used alongside our React component code. And by replacing a runtime styling library with one that compiles to static CSS files at build time, we are able to continue driving toward faster <strong class="ld jb">performance</strong>. Thanks to the Linaria <strong class="ld jb">community</strong> and our collaboration, we expect this library to continue to improve for many years.</p><p id="0d26" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc">Interested in working at Airbnb? 
Check out these open roles:</p><p id="4588" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc"><a class="au lx" href="https://grnh.se/ebfa55151us" rel="noopener ugc nofollow" target="_blank">Frontend Infrastructure Engineer, Web Platform</a><br /><a class="au lx" href="https://grnh.se/b5afa9151us" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Data Governance </a><br /><a class="au lx" href="https://grnh.se/92c32fed1us" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Cloud Infrastructure </a><br /><a class="au lx" href="https://grnh.se/bbe55fe81us" rel="noopener ugc nofollow" target="_blank">Staff Database Engineer </a><br /><a class="au lx" href="https://grnh.se/21e5c2011us" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer — ML Ops Platform </a><br /><a class="au lx" href="https://grnh.se/ee114dfc1us" rel="noopener ugc nofollow" target="_blank">Senior/Staff Software Engineer, Service Capabilities</a></p><h1 id="3e5b" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Acknowledgments</h1><p id="2e09" class="pw-post-body-paragraph lb lc ja ld b le mq kb lg lh mr ke lj lk ms lm ln lo mt lq lr ls mu lu lv lw it gc">We have a lot of appreciation for the folks at <a class="au lx" href="https://www.callstack.com/" rel="noopener ugc nofollow" target="_blank">callstack</a> and the <a class="au lx" href="https://github.com/callstack/linaria#contributors" rel="noopener ugc nofollow" target="_blank">Linaria community</a> for building such a great tool and for collaborating with us to make it even better. Also for <a class="au lx" href="https://www.khanacademy.org/" rel="noopener ugc nofollow" target="_blank">Khan Academy</a> for giving us Aphrodite which served us well for many years. 
This has been a huge effort at Airbnb that would not have been possible without all the work put in by so many people at Airbnb, including Mars Jullian, Josh Nelson, Nora Tarano, Alan Wright, Jimmy Guo, Ian Demattei-Selby, Victor Lin, Nnenna John, Adrianne Soike, Garrett Berg, Andrew Huth, Austin Wood, Chris Sorenson, and Miles Johnson. Finally, thank you to Surashree Kulkarni for help editing this blog post. Thank you all!</p></div><div class="o dz nz oa if ob" role="separator"><div class="it iu iv iw ix"><p id="d1bc" class="pw-post-body-paragraph lb lc ja ld b le lf kb lg lh li ke lj lk ll lm ln lo lp lq lr ls lt lu lv lw it gc"><em class="nr">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnbs-trip-to-linaria-dc169230bd12</link>
      <guid>https://medium.com/airbnb-engineering/airbnbs-trip-to-linaria-dc169230bd12</guid>
      <pubDate>Thu, 16 Jun 2022 19:41:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Graph Machine Learning at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@devins?source=post_page-----f868d65f36ee--------------------------------"><div class="l dq"><img alt="Devin Soni" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/2*1Z-JeVOl6yKM5IwAUOu35Q.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@devins?source=post_page-----f868d65f36ee--------------------------------">Devin Soni</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">Jun 14</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">10 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div><div class="cl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><div class=""><h2 id="52fd" class="pw-subtitle-paragraph jx iz ja bn b jy jz ka kb kc kd ke kf kg 
kh ki kj kk kl km kn ko co"><strong class="ba">How Airbnb is leveraging graph neural networks to up-level our machine learning</strong></h2></div><p id="00f7" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">By:<a class="au ll" href="https://www.linkedin.com/in/devinsoni/" rel="noopener ugc nofollow" target="_blank"> Devin Soni</a></p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go lm"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/1*bEZU2cupMt44ke6mkK-low.jpeg" width="700" height="467" role="presentation" /></div></div></figure><h1 id="56e7" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Introduction</h1><p id="8dad" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">Many real-world machine learning problems can be framed as graph problems. On online platforms, users often share assets (e.g. photos) and interact with each other (e.g. messages, bookings, reviews). These connections between users naturally form edges that can be used to create a graph.</p><p id="4979" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">However, in many cases, machine learning practitioners do not leverage these connections when building machine learning models, and instead treat nodes (in this case, users) as completely independent entities. 
While this does simplify things, leaving out information around a node’s connections may reduce model performance by ignoring where this node is in the context of the overall graph.</p><p id="ec41" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">In this blog post, we will explain the benefits of using graphs for machine learning, and show how leveraging graph information allows us to learn more about our users, in addition to building more contextual representations of them [4]. We will then cover specific graph machine learning methods, such as Graph Convolutional Networks, that are being used at Airbnb to improve upon existing machine learning models.</p><p id="3a8f" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">The motivating use-case for this work is to build machine learning models that protect our community from harm, but many of the points being made and systems being built are quite generic and could be applied to other tasks as well.</p><h1 id="fafc" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Challenges</h1><h2 id="f92c" class="mv lz ja bn ma mw mx my me mz na nb mi ky nc nd mk lc ne nf mm lg ng nh mo ni gc"><strong class="ba">The problem</strong></h2><p id="b442" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">When building trust &amp; safety machine learning models around entities such as users or listings, we generally begin by reaching for features that directly describe the entity. For example, in the case of users, we may use features such as their location, account age, or number of bookings. 
However, these simple features do not adequately describe the user in the context of the overall Airbnb platform and their interactions with other users.</p><p id="5e7b" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Consider the hypothetical scenario in which a new host joins Airbnb. A week into their hosting journey, we likely do not have a lot of information about them other than what they have directly told us. This could include their listing’s location or their phone number. These direct attributes given to us by the host are relatively surface level and do not necessarily help us understand their trustworthiness or reputation.</p><p id="ba2b" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">In this state, it is hard for Airbnb to provide this new host with the best possible experience since we do not know what their usage pattern of the platform will be. Lacking information, we might then make this new host go through a slower onboarding process or request a lot of information up-front. Understanding this user’s relationships to the rest of the platform is data we can leverage to provide them with an improved experience.</p><h2 id="9a97" class="mv lz ja bn ma mw mx my me mz na nb mi ky nc nd mk lc ne nf mm lg ng nh mo ni gc"><strong class="ba">An illustration of the usefulness of graphs</strong></h2><p id="3c3d" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">While we do not have much direct information about the new host, what we can do is leverage their surroundings to try and learn more. 
One example of this is the connections that they have to other users.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go nj"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1346/1*BAJt6yhBPetG4HwwUylGiQ.png" width="673" height="349" role="presentation" /></div></div></figure><p id="245f" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">We can first take a look at their one-hop neighborhood, or in other words, the set of users whom this host has a direct connection with. In this example, we can see that this new host shares a listing photo with an existing, tenured host. We can also see that the new host’s listing is in the same house as listings from three other hosts. With this additional knowledge, we now know more about the new host; they might be working with other hosts who have rooms in the same house. However, we can’t be completely sure how all of the connected hosts relate to each other without looking at more of the graph.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go nk"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/1*9u8zd4_qXAA46RGCUWa0kw.png" width="700" height="508" role="presentation" /></div></div></figure><p id="1462" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Let’s further expand our view and consider the two-hop neighborhood of the new host. This expands our view to users who are not necessarily directly connected to the new host. In this expanded network we now see that many of the hosts with listings in the location are connected to each other through a shared business name. 
It now becomes very likely that this new host is part of an existing group of hosts that rent out rooms in the same house, and has not yet updated their profile to reflect this.</p><p id="562b" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Using the power of graphs, we were able to learn about a new host simply by inspecting their connections to other users within Airbnb. This additional knowledge extracted from the graph provides us with an improved glimpse into who our new host is. We are subsequently able to deliver an improved experience to this new host, all without requiring them to provide any more information to Airbnb.</p><p id="1ba5" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Supplementing our models with graph information is one way to bootstrap our models. Using graphs, we can construct a detailed understanding of our users in scenarios where we have little historical data or observations pertaining to a user. While the semantic information we gain from the graph is often inferred and not directly told to us by the user, it can give us a strong baseline level of knowledge until we have more factual information about a user.</p><h1 id="fa73" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc"><strong class="ba">Graph Machine Learning</strong></h1><p id="390b" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">We have established that we want our machine learning models to be able to ingest graph information. The main challenge is figuring out how best to condense everything a graph can represent into a format that our models can use. 
Let’s dig into some of the options and explore the solution that we ultimately implemented.</p><p id="e719" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">One simple option is to calculate statistics about nodes and use them as numeric features. For example, we can calculate the number of users a user is connected to or how many listing photos they share with other hosts. These metrics are straightforward to calculate and give us a basic sense of the node’s role in the overall structure of the graph. They are valuable but do not leverage the node’s features. As such, simple statistics cannot go beyond representing graph structure.</p><p id="bc13" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">What we really want is to be able to produce an aggregation of a node’s neighborhood in the graph that captures both the node’s structural role in the graph and its node features. For example, we want to know more than how many users a user is connected to; we also want to understand the type of users they are connected to (e.g. their account tenure, or past booking counts) because that gives us more hints about the original user than simple edge counts.</p><h2 id="77f0" class="mv lz ja bn ma mw mx my me mz na nb mi ky nc nd mk lc ne nf mm lg ng nh mo ni gc"><strong class="ba">Graph Convolutional Networks</strong></h2><p id="6b2c" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">To capture both graph structure and node features, we can use a type of graph neural network architecture called a graph convolutional network. Graph convolutional networks (GCNs) are neural networks that generally take as input a matrix of node features together with an adjacency matrix of the graph, and produce node-level outputs. 
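</p><p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">To make those inputs and outputs concrete, here is a minimal sketch of a single GCN layer in plain Python, using the symmetric normalization from the GCN paper [1]. The toy graph, the feature values, the fixed weight matrix, and the helper names (matmul, normalize_adjacency, gcn_layer) are all assumptions made for illustration; this is not the code we run in production.</p>
<pre>
```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the normalization used in the GCN paper [1]."""
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # at least 1 thanks to the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(a_norm, features, weights):
    """One GCN layer: aggregate neighbor features, project them, apply ReLU."""
    projected = matmul(matmul(a_norm, features), weights)
    return [[max(value, 0.0) for value in row] for row in projected]

# Toy graph (hypothetical): three users with edges 0-1 and 1-2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0, 0.0],   # two made-up features per user
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[0.5, -1.0],   # fixed here; learned during training in practice
           [1.0, 0.25]]

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weights)  # one output row per node
```
</pre>
<p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Each row of hidden is a node-level output that mixes a node’s own features with those of its direct neighbors.</p><p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">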
This type of network architecture is preferred over simply concatenating pre-computed structural features with node features because it is able to jointly represent the two types of information, likely producing richer embeddings.</p><p id="8bcb" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Graph convolutional networks consist of multiple layers. A single GCN layer aims to learn a representation of the node that aggregates information from its neighborhood (and in most cases, combines its neighborhood information with its own features). Using the example of our newly created host account, one GCN layer would be the host’s one-hop neighborhood. A second GCN layer representing the host’s two-hop neighborhood can be introduced to capture additional information. Since the output of GCN layer N is used to produce the representations used in GCN layer N+1, adding layers increases the span of the aggregation used to generate node representations [1].</p><p id="2c16" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Drawing from the example in the previous section, we would need a GCN with two layers in order to produce a graph embedding that captures the illustrated subgraph. We could go even deeper and expand further to third-order, fourth-order, and so on. In practice, however, a small number of layers (e.g. 
2–4) is sufficient, as the connections beyond that point are likely to be very noisy and unlikely to be relevant to the original user.</p><h2 id="3bb0" class="mv lz ja bn ma mw mx my me mz na nb mi ky nc nd mk lc ne nf mm lg ng nh mo ni gc"><strong class="ba">Model architecture &amp; training</strong></h2><p id="05c6" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">Having decided to use GCNs, we must now consider how complex we want each layer’s method of aggregating neighboring nodes’ features to be. A wide variety of aggregation methods can be used, including mean pooling, sum pooling, and more complex aggregators involving attention mechanisms [5].</p><p id="1391" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">When it comes to trust &amp; safety, we often work in adversarial problem domains where frequent model retraining is required due to concept drift. Limiting both model complexity and the number of models that must be retrained is important for reducing maintenance complexity.</p><p id="fbab" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">One might assume that GCNs with more complex, expressive aggregation functions are always better. This is not necessarily the case. In fact, several papers have shown that in many cases very simple graph convolutional networks are all that is needed for state-of-the-art performance in practical tasks [2, 3]. The Simple Graph Convolution (SGC) architecture showed that we can achieve performance comparable to more complex aggregators using GCN layers that do not have trainable weights [2]. The Scalable Inception Graph Neural Networks (SIGN) architecture showed that, in general, we can precompute multiple aggregations without trainable weights, and use them in parallel as inputs into a downstream model [3]. 
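</p><p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">The weight-free precomputation can be sketched as follows, assuming an SGC-style normalized adjacency as the only aggregator and a SIGN-style concatenation of the per-hop results. The toy graph, the single made-up feature, and the helper names (matmul, normalize_adjacency, precompute_hops) are hypothetical; this shows the general idea rather than our pipeline.</p>
<pre>
```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return S = D^-1/2 (A + I) D^-1/2, the fixed operator used by SGC [2]."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def precompute_hops(adj, features, num_hops):
    """Concatenate X, S X, S^2 X, ... per node; nothing here is trainable."""
    s = normalize_adjacency(adj)
    hop = features
    blocks = [features]
    for _ in range(num_hops):
        hop = matmul(s, hop)  # widen the aggregation by one more hop
        blocks.append(hop)
    # Join the per-hop blocks column-wise so each node gets one flat vector.
    return [[v for block in blocks for v in block[i]] for i in range(len(features))]

# Toy graph (hypothetical): three users with edges 0-1 and 1-2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0], [0.0], [0.0]]  # only user 0 carries the made-up signal

embeddings = precompute_hops(adj, features, num_hops=2)
```
</pre>
<p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">In this toy example, user 2 only picks up user 0’s signal in the two-hop column, which is the kind of information a downstream model could then consume as ordinary features.</p><p class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">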
SIGN and SGC are very related; SIGN provides a general framework for precomputing graph aggregations, and SGC provides the most straightforward aggregator to use within the SIGN framework.</p><figure class="ln lo lp lq gz lr gn go paragraph-image"><div role="button" tabindex="0" class="ls lt dq lu cf lv"><div class="gn go nl"><img alt="" class="cf lw lx" src="https://miro.medium.com/max/1400/1*5h9Tp1c8MkyEJAE5ZR2hBA.png" width="700" height="360" role="presentation" /></div></div></figure><p id="2621" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Using SIGN and SGC, the GCN is purely a fixed feature extractor that does not need to learn anything itself — it has no weights that must be tuned during training. In this setting we are able to fundamentally treat the GCN as a fixed mathematical formula applied to its inputs. This aspect is very convenient because we do not need to worry about supervised training or pre-training of the GCN itself.</p><h2 id="0819" class="mv lz ja bn ma mw mx my me mz na nb mi ky nc nd mk lc ne nf mm lg ng nh mo ni gc"><strong class="ba">Model serving</strong></h2><p id="b509" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">When serving a graph neural network, the main considerations are around freshness of the data and how to obtain the inputs for the model in a production setting. Our primary concern is the trade-offs related to data freshness. The decision between real-time or batch methods has an impact on how up to date the information is.</p><p id="91ff" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Real-time methods can provide downstream models with the most up-to-date information. This increased freshness does, however, require more effort to serve the embeddings. 
In addition, it often relies on a downsampled version of the graph to handle nodes with many edges, such as in the GraphSAGE algorithm [4].</p><p id="e52a" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Offline batch methods are able to calculate all node embeddings at once. This provides a distinct advantage over real-time methods by reducing implementation complexity. Unfortunately, this simplicity comes at a cost: we will not necessarily be able to serve the most recent node embedding, as we can only leverage information present in the last run of the pipeline.</p><h1 id="2741" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc"><strong class="ba">Chosen solution</strong></h1><p id="514b" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">Given all the tradeoffs and our requirements, we ultimately decided to use a periodic offline pipeline which leverages the SIGN method for our initial implementation. The easy maintenance of batch pipelines and the relative simplicity of SIGN allow us to optimize for learning instead of performance initially.</p><p id="0ce6" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Despite the fact that many of our trust &amp; safety models are run online in real-time, we decided to start with an offline graph model. Features are computed using a snapshot of the graph and node features. When fetching these features online, the downstream model simply looks up the previous run’s output from our feature store, rather than having to compute the embedding in real-time. 
The alternative of a real-time graph embedding solution would involve a significant amount of additional implementation complexity.</p><h1 id="fae5" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Benefits Realized</h1><p id="e62d" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">With the batch pipeline implemented, we now have access to new information as features in downstream models. Our existing feature sets did not capture this information, and adding it has resulted in significant gains in our models. Components of the embedding are often among the top 10 features in downstream models based on feature importance computed using <a class="au ll" href="https://github.com/slundberg/shap" rel="noopener ugc nofollow" target="_blank">the SHAP approach</a>.</p><p id="e1f2" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">These positive results encourage further investment in the area of graph embeddings and graph signals, and we plan to explore other types of graphs &amp; graph edges. Investigating how to make our embedding more powerful, either through improving the freshness of the data or using other algorithms, has become a priority for us based on the success of augmenting our existing models with graph knowledge.</p><h1 id="59d6" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Conclusion</h1><p id="b20f" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">In this blog post, we showed how leveraging graph information can be broadly useful and discussed our approach to implementing graph machine learning. We ultimately decided to use a SIGN architecture that leverages a batch pipeline to calculate graph embeddings. These are subsequently fed into downstream models as features. 
Many of the new features have led to notable performance gains in the downstream models.</p><p id="988a" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">We hope that this information helps others understand how to leverage graph information to improve their models. Our various considerations provide insight into what one must be aware of when deciding to implement a graph machine learning system.</p><p id="d710" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">Graph machine learning is an exciting area of research in Airbnb, and this is only the beginning. If this type of work interests you, check out some of our related positions:</p><p id="8b74" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc"><a class="au ll" href="https://careers.airbnb.com/positions/3910069/" rel="noopener ugc nofollow" target="_blank">Senior Machine Learning Engineer</a></p><p id="1d71" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc"><a class="au ll" href="https://careers.airbnb.com/positions/4113532/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Trust</a></p><h1 id="7a40" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">Acknowledgments</h1><p id="a02b" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">This project couldn’t have been done without the great work of many people and teams. 
We would like to thank:</p><ul class=""><li id="0572" class="nm nn ja kr b ks kt kv kw ky no lc np lg nq lk nr ns nt nu gc">Owen Sconzo for working on this project and reviewing all of the code.</li><li id="392b" class="nm nn ja kr b ks nv kv nw ky nx lc ny lg nz lk nr ns nt nu gc">The Trust Foundational Modeling team for providing the foundational data for graph modeling.</li><li id="062b" class="nm nn ja kr b ks nv kv nw ky nx lc ny lg nz lk nr ns nt nu gc">Members of the Fraud &amp; Abuse Working Group for supporting this project, reviewing this blog post, and providing suggestions.</li></ul><h1 id="30f5" class="ly lz ja bn ma mb mc md me mf mg mh mi kg mj kh mk kj ml kk mm km mn kn mo mp gc">References</h1><p id="12c9" class="pw-post-body-paragraph kp kq ja kr b ks mq kb ku kv mr ke kx ky ms la lb lc mt le lf lg mu li lj lk it gc">[1] <a class="au ll" href="https://arxiv.org/abs/1609.02907" rel="noopener ugc nofollow" target="_blank">Semi-Supervised Classification with Graph Convolutional Networks.</a></p><p id="1e59" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">[2] <a class="au ll" href="https://arxiv.org/abs/1902.07153" rel="noopener ugc nofollow" target="_blank">Simplifying Graph Convolutional Networks</a></p><p id="55c2" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">[3] <a class="au ll" href="https://arxiv.org/abs/2004.11198" rel="noopener ugc nofollow" target="_blank">SIGN: Scalable Inception Graph Neural Networks</a></p><p id="8dde" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">[4] <a class="au ll" href="https://arxiv.org/abs/1706.02216" rel="noopener ugc nofollow" target="_blank">Inductive Representation Learning on Large Graphs</a></p><p id="71a6" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk 
it gc">[5] <a class="au ll" href="https://arxiv.org/abs/1710.10903" rel="noopener ugc nofollow" target="_blank">Graph Attention Networks</a></p><p id="c9f4" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc">****************</p><p id="b21e" class="pw-post-body-paragraph kp kq ja kr b ks kt kb ku kv kw ke kx ky kz la lb lc ld le lf lg lh li lj lk it gc"><em class="oa">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/graph-machine-learning-at-airbnb-f868d65f36ee</link>
      <guid>https://medium.com/airbnb-engineering/graph-machine-learning-at-airbnb-f868d65f36ee</guid>
      <pubDate>Tue, 14 Jun 2022 17:59:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Unified Payments Data Read at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@alicangoksel?source=post_page-----e613e7af1a39--------------------------------"><div class="l dq"><img alt="Alican GÖKSEL" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/1*azaON2JpGQFP2bDmaPmLzg.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@alicangoksel?source=post_page-----e613e7af1a39--------------------------------">Alican GÖKSEL</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">Jun 9</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">10 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div><div class="cl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><p id="6fe1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh 
ki kj kk kl km kn ko kp kq kr ks kt ku it gc">How we redesigned payments data read flow to optimize client integrations, while achieving up to 150x performance gains.</p><p id="b4de" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">By: <a class="au kv" href="https://www.linkedin.com/in/ali-can-g%C3%B6ksel-7189214a" rel="noopener ugc nofollow" target="_blank">Ali Goksel,</a> <a class="au kv" href="https://www.linkedin.com/in/yixiamao" rel="noopener ugc nofollow" target="_blank">Yixia Mao</a></p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*fWhQRxAT0vMQwTEW" width="700" height="467" role="presentation" /></div></div></figure><h1 id="3af1" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Introduction</h1><p id="8ced" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">In recent years, Airbnb migrated most of its backend services from a monolith to a service-oriented architecture (SOA). This industry standard architecture brings countless benefits to a company that is at the scale of Airbnb; however, it is not free of challenges. With data scattered across many services, it’s difficult to provide all the information clients need in a simple and performant way, especially for complex domains such as payments. As Airbnb grew, this problem started to crop up for many new initiatives such as host earnings, tax form generation, and payout notifications, all of which required data to be read from the payments system.</p><p id="0f1c" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In this blog post, we introduce Airbnb’s unified payments data read layer. 
This read layer was custom built to reduce the friction and complexity for client integrations, while greatly improving query performance and reliability. With this re-architecture, we were able to provide a greatly optimized experience to our host and guest communities, as well as for internal teams in the trust, compliance, and customer support domains.</p><h1 id="45d6" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Evolution of Airbnb’s Payments Platform</h1><p id="93ac" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Payments is one of the earliest functionalities of the Airbnb app. Since our co-founder Nate’s first commit, Payments Platform has grown and evolved tremendously, and it continues to evolve at an even faster pace given our expanding global presence.</p><p id="1a8f" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Similar to other companies, Airbnb started its journey with a monolithic application architecture. Since the feature set was initially limited, both write and read payment flows were “relatively” simple.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div class="gn go ml"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1200/0*dsBzT9rDVCNoG8MM" width="600" height="863" role="presentation" /></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">Overly simplified diagram of Airbnb’s old monolithic architecture. Payments schemas were not very complex, and the feature set was limited.</figcaption></figure><p id="fe24" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Predictably, this architecture couldn’t scale well with the rapid growth and expansion of our company. Payments, along with most other parts of the tech stack, started to migrate to the SOA architecture. 
This brought a significant overhaul of the existing architecture and provided many advantages, including:</p><ul class=""><li id="84a5" class="mp mq ja jz b ka kb ke kf ki mr km ms kq mt ku mu mv mw mx gc">We had clear boundaries between different services, which enabled better domain ownership and faster iterations.</li><li id="19b2" class="mp mq ja jz b ka my ke mz ki na km nb kq nc ku mu mv mw mx gc">Data was separated into domains in a very normalized shape, resulting in better correctness and consistency.</li></ul><p id="ceaf" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">For more, take a peek at our <a class="au kv" rel="noopener" href="https://medium.com/airbnb-engineering/rebuilding-payment-orchestration-at-airbnb-341d194a781b">blog post</a> detailing the payments SOA migration.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*6lC8Bu1u3DA0WAzV" width="700" height="147" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">After the SOA migration, every payments subdomain has its own service(s) and tables with clear boundaries, but more features lead to more complex and normalized data.</figcaption></figure><h1 id="95d0" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">New Architecture Introduces New Challenges</h1><p id="fea4" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Payments SOA provided us with a more resilient, scalable, and maintainable payments system. During this long and complex migration, correctness of the system was our top priority. Data was normalized and scattered across many payments domains according to each team’s responsibilities. 
This subdivision of labor had an important side effect: presentation layers now often needed to integrate with multiple payments services to fetch all the required data.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*itWZQ8kaxNjoZ-7f" width="700" height="287" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">How payments data read flows looked after the SOA migration. Presentation services called one or more payments services and aggregated data at the application layer.</figcaption></figure><p id="6cd3" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">At Airbnb, we believe in being transparent with our host and guest communities. Our surfaces related to payments and earnings display a range of details including fees, transaction dates, currencies, amounts, and total earnings. After the SOA migration, we needed to look into multiple services and read from even more tables than prior to the migration to get all the requested information. Naturally, this foundation brought challenges when we wanted to add new surfaces with payments data, or when we wanted to extend the existing surfaces to provide additional details. There were three main challenges that we needed to solve.</p><p id="1c75" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The first challenge was that <em class="nd">clients now needed to understand the payments domain well enough</em> to pick the correct services and APIs. For client engineers from other teams, this required a non-trivial amount of time investment and slowed down overall time to market. 
On the payments side, engineers needed to provide continuous consultation and guidance, occupying a significant portion of their work time.</p><p id="43da" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The second challenge was that there were many instances in which we had to change multiple payments APIs at the same time in order to meet the client requirements. When there are <em class="nd">too many touchpoints</em>, it becomes <em class="nd">hard to prioritize requests</em> since many teams have to be involved. This problem also had a significant negative impact on time to market. We had to slow down or push back feature releases when the alignment and prioritization meetings did not go smoothly. Similarly, when payments teams had to update their APIs, they had to make sure that all presentation services adopted these changes, which slowed down progress on the payments system.</p><p id="6e3b" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Last but not least, the technical quality of the complex read flows was not where we wanted it to be. Application-level aggregations worked fine for the average use case, but we had room for improvement when it came to our large hosts and especially for our prohosts, who might have thousands of yearly bookings on our platform. 
To have confidence in our system over the long term, we needed to find a solution that provided inherently better <strong class="jz jb"><em class="nd">performance, reliability, and scalability</em>.</strong></p><h1 id="b7d1" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Introducing the Payments Unified Data Read Layer</h1><p id="a2d7" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">To achieve our ambitious goals for payments, we needed to re-think how clients integrate with our payments platform.</p><h1 id="d8af" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Unified Entry Points</h1><p id="bf61" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Our first task was to unify the payments data read entry points. To accomplish this, we leveraged <a class="au kv" rel="noopener" href="https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344">Viaduct</a>, Airbnb’s data-oriented service mesh, where clients query for the “entity” instead of needing to identify dozens of services and their APIs. 
This new architecture required our clients to worry only about the requisite data entity rather than having to communicate with individual payments services.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*rtNAGGJFpwHfnZ5V" width="700" height="337" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">Instead of communicating with individual payments services, presentation services just use the read layer.</figcaption></figure><p id="4a64" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In these entry points, we provided as many filtering options as possible so each API could hide filtering and aggregation complexity from its clients. This also greatly reduced the numbers of APIs we needed to expose.</p><h1 id="8474" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Unified Higher-Level Data Entities</h1><p id="5527" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Having a single entry point is a good start, but it does not resolve all the complexity. In payments, we have 100+ data models, and it requires a decent amount of domain knowledge to understand their responsibilities clearly. If we just expose all of these models from a single entry point, there would still be too much context required for client engineers.</p><p id="1616" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Instead of making our clients deal with this complexity, we opted to hide payments internal details as much as possible by coming up with <strong class="jz jb">higher-level domain entities</strong>. 
Through this process, we were able to reduce the core payments data to fewer than ten high-level entities, which greatly reduced the amount of exposed payments internal details. These new entities also allowed us to guard clients against changes made in the Payments platform. When we internally update the business logic, we keep the entity schema the same without requiring any migrations on the client side. Our principles for the new architecture were the following:</p><ul class=""><li id="8f4d" class="mp mq ja jz b ka kb ke kf ki mr km ms kq mt ku mu mv mw mx gc"><strong class="jz jb">Simple</strong>: Design for non-payments engineers, and use common terminology.</li><li id="051e" class="mp mq ja jz b ka my ke mz ki na km nb kq nc ku mu mv mw mx gc"><strong class="jz jb">Extensible</strong>: Maintain loose coupling with storage schema, and encapsulate concepts to protect from payments internal changes while allowing quick iterations.</li><li id="4bd1" class="mp mq ja jz b ka my ke mz ki na km nb kq nc ku mu mv mw mx gc"><strong class="jz jb">Rich</strong>: Hide away the complexity but not the data. 
If clients need to fetch data, they should be able to find it in <strong class="jz jb"><em class="nd">one</em></strong> of the entities.</li></ul><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*hPMYFXmawEXCM8tf" width="700" height="375" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">Expose cleaner higher-level domain entities to hide payments internal details while guarding clients from frequent API migrations.</figcaption></figure><h1 id="849d" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Materialize Denormalized Data</h1><p id="ce3c" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">With unified entry points and entities, we greatly reduced the complexity for client onboardings. However, the “<strong class="jz jb"><em class="nd">how</em></strong>” of fetching the data, combined with expensive application layer aggregations, was still a big challenge. While it’s important that clients are able to integrate with the payments system smoothly, our valued community should also enjoy the experience on our platform.</p><p id="0f55" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The core problem we identified was <strong class="jz jb"><em class="nd">dependency on many tables and services during client queries</em></strong>. One of the promising solutions was denormalization–essentially, moving these expensive operations from query time to ingestion time. We explored different ways of pre-denormalizing payments data and materializing it reliably with less than 10 seconds replication lag. 
To our great luck, our friends in the Homes Foundation team were piloting a Read-Optimized Store Framework, which takes an event-driven lambda approach to materializing secondary indices. Using this framework, teams are able to get both near real-time data via database change capture mechanisms and historical data leveraging our daily database dumps stored in Hive. In addition, the maintenance requirements of this framework (e.g., single code for online and offline ingestion, written in Java) were much lower compared to other existing internal solutions.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*aEEGisG92n5f9mc8" width="700" height="320" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">A high-level look at the read-optimized store framework usage by payments. It provides ingestion flows for both offline and near real-time data with shared business logic between them.</figcaption></figure><p id="c6ed" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">After combining all of the above improvements, our new payments read flow looked like the following:</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*kWOXKjHjm7Jvskze" width="700" height="399" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">The final shape of the payments data read architecture. 
Clients do not need to know any payments services or internals.</figcaption></figure><p id="ebad" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">We provide data in a reliable and performant way via denormalized read-optimized store indices.</p><h1 id="642e" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Results</h1><h1 id="92b6" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Migrate and Elevate: Transaction History</h1><p id="74a9" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">The first test surface for the new unified data read architecture was Transaction History (TH). Hosts on our platform use the <a class="au kv" href="https://www.airbnb.com/users/transaction_history" rel="noopener ugc nofollow" target="_blank">Transaction History page</a> to view their past and future payouts along with top-level earning metrics (e.g., total paid out amount).</p><p id="23ec" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">On the technical side, this was one of the most complex payments flows we had. There were many different details required, and the data was coming from <strong class="jz jb">10+</strong> payments tables. This had caused issues in the past, including timeouts, slow loading times, downtime due to hard dependencies, and slow iteration speed as a result of complex implementations. While doing the initial technical design for TH migration from Airbnb monolith to SOA, we took the hard path of re-architecting this flow instead of applying band-aids. 
This helped to ensure long-term success and provide the best possible experience to our host community.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go ne"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*homIPsg24j6ZA1FT" width="700" height="717" role="presentation" /></div></div></figure><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*8SKW7SUEmrbGMFO1" width="700" height="380" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">Transaction History page and simplified high level architecture. Airbnb monolith app behaves like a presentation service and fetches data from multiple payment services and also from legacy databases.</figcaption></figure><p id="af20" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">This use case was a great fit for our unified read layer. Using the data used by TH as a starting point, we came up with a new API and high-level entity to serve all data read use cases from similar domains.</p><p id="f1b9" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">After locking down the entity and its schema, we started to denormalize the data. Thanks to the read-optimized store framework, we were able to denormalize all the data from 10+ tables into a couple of Elasticsearch indices. Not only did we greatly reduce the touchpoints of the query, we were also able to paginate and aggregate much more efficiently by leveraging the storage layer instead of doing the same operations on the application layer. 
After close to two years of work, we migrated 100% of traffic and achieved up to <strong class="jz jb"><em class="nd">150x</em></strong> latency improvements, while improving the reliability of the flow from ~96% to <strong class="jz jb"><em class="nd">99.9+%</em></strong><em class="nd">.</em></p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*NvD6A6Q0AswwXib-" width="700" height="667" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">After the re-architecture, payments data needed by Transaction History is provided by payments read-optimized store and accessed by clients using a well-defined and extensible payout schema over the unified data read layer.</figcaption></figure><h1 id="bb65" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Unlocking New Experiences: Guest Payment History</h1><p id="90d9" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Our next use case, called Guest Payment History, came out of Airbnb’s annual company-wide hackathon. This hackathon project aimed to provide a detailed and easy way for our guest community to track their payments and refunds. Similar to Transaction History, this scenario also required information from multiple payments services and databases, including many legacy databases.</p><p id="7407" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Guest Payment History (GPH) also helped to showcase many benefits brought by the unified read layer: a new unified entity to serve GPH and future similar use cases, along with an extensible API which supported many different filters. 
We denormalized and stored data from legacy and SOA payment tables using the read-optimized store framework into a single Elasticsearch index, which greatly reduced the complexity and cost of queries.</p><p id="8b70" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">We released this new page to our community with our <a class="au kv" href="https://news.airbnb.com/2021-winter-release/" rel="noopener ugc nofollow" target="_blank">2021 Winter launch</a> and achieved a huge reduction in customer support tickets related to questions about guest payments, which resulted in close to $1.5M in cost savings for 2021. It also illustrated our move towards a stronger technical foundation with high reliability and low latency.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*KNPtvQ1odi8dp2Xw" width="700" height="700" role="presentation" /></div></div><figcaption class="mm bm gp gn go mn mo bn b bo bp co">Guests can track their payments and refunds using Guest Payment History.</figcaption></figure><p id="a9e1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The architecture is very similar to TH, where data is provided to clients via a unified API and schema, backed by a secondary store.</p><p id="f593" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">After exposing these new entities via TH and GPH, we started to onboard many other critical use cases to leverage the same flow in order to efficiently serve and surface payments data.</p><h1 id="160a" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Conclusion</h1><p id="9fd7" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi 
kk kl km mj ko kp kq mk ks kt ku it gc">Microservice/SOA architectures greatly help backend teams to independently scale and develop various domains with minimal impact to each other. It’s equally important to make sure the clients of these services and their data will not be subject to additional challenges under this new industry-standard architecture.</p><p id="8da0" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In this blog post, we illustrated some potential solutions, such as unified APIs and higher-level entities to hide away the internal service and architectural complexities from the callers. We also recommend leveraging denormalized secondary data stores to perform expensive join and transformation operations during ingestion time to ensure client queries can stay simple and performant. As we demonstrated with multiple initiatives, complex domains such as payments can significantly benefit from these approaches.</p><p id="9964" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">If this type of work interests you, take a look at the following related positions:</p><p id="4e93" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><strong class="jz jb">US</strong>:</p><p id="0a91" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/3393185/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Payments</a></p><p id="c636" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><strong class="jz jb">India</strong>:</p><p id="2f01" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" 
href="https://careers.airbnb.com/positions/3153981/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Cities Bangalore</a></p><p id="6cb9" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/3842855/" rel="noopener ugc nofollow" target="_blank">Engineering Manager, Ambassador Platform Products</a></p><p id="beb1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/2768475/" rel="noopener ugc nofollow" target="_blank">Manager, Engineering Payments Compliance</a></p><p id="92f5" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/2773515/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Payments Compliance</a></p><p id="73ab" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/2925359/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Payments Compliance</a></p><h1 id="7fe9" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Acknowledgments</h1><p id="0d69" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">We had many people at Airbnb contributing to this big re-architecture, but countless thanks to Mini Atwal, Yong Rhyu, Musaab At-Taras, Michel Weksler, Linmin Yang, Linglong Zhu, Yixiao Peng, Bo Shi, Huayan Sun, Wentao Qi, Adam Wang, Erika Stott, Will Koh, Ethan Schaffer, Khurram Khan, David Monti, Colleen Graneto, Lukasz Mrowka, Bernardo Alvarez, Blazej Adamczyk, Dawid Czech, Marcin Radecki, Tomasz Laskarzewski, Jessica Tai, 
Krish Chainani, Victor Chen, Will Moss, Zheng Liu, Eva Feng, Justin Dragos, Ran Liu, Yanwei Bai, Shannon Pawloski, Jerroid Marks, Yi He, Hang Yuan, Xuemei Bao, Wenguo Liu, Serena Li, Theresa Johnson, Yanbo Bai, Ruize Lu, Dechuan Xu, Sam Tang, Chiao-Yu Tuan, Xiaochen He, Gautam Prajapati, Yash Gulani, Abdul Shakir, Uphar Goyal, Fanchen Kong, Claire Thompson, Pavel Lahutski, Patrick Connors, Ben Bowler, Gabriel Siqueira, Jing Hao, Manish Singhal, Sushu Zhang, Jingyi Ni, Yi Lang Mok, Abhinav Saini, and Ajmal Pullambi. We couldn’t have accomplished this without your invaluable contributions.</p><h1 id="f5f1" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">****************</h1><p id="8d19" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc"><em class="nd">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/unified-payments-data-read-at-airbnb-e613e7af1a39</link>
      <guid>https://medium.com/airbnb-engineering/unified-payments-data-read-at-airbnb-e613e7af1a39</guid>
      <pubDate>Fri, 10 Jun 2022 00:23:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Faster JavaScript Builds with Metro]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@raejin?source=post_page-----cfc46d617a1f--------------------------------"><div class="l dq"><img alt="Rae Liu" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/1*uRsBOW8RAAsSZMrQjFpPMg.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@raejin?source=post_page-----cfc46d617a1f--------------------------------">Rae Liu</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">May 24</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">11 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div></div><div class="cl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div 
class="it iu iv iw ix"><div class=""><p id="d786" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><em class="kv">How Airbnb migrated from Webpack to Metro and made the development feedback loop nearly instantaneous, the largest production build 50% faster, with marginal end-user runtime improvements.</em></p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*RZFWkaoezUfVzTxvpfm2gQ.jpeg" width="700" height="467" role="presentation" /></div></div></figure><p id="5b5e" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><strong class="jz jb">By:</strong> <a class="au li" href="https://www.linkedin.com/in/raejin/" rel="noopener ugc nofollow" target="_blank">Rae Liu</a></p><h1 id="6e9e" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Introduction</h1><p id="53df" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">In 2018, the frontend Airbnb infrastructure relied on Webpack for JavaScript bundling which had served us well up until then; however, with our codebase almost having quadrupled in the previous year, the frontend team was noticing a significant impact on the development experience. Not only was build performance slow, but the average page refresh time for a trivial one-line code change was anywhere between 30 seconds and 2 minutes depending on the project size. 
To mitigate this, the team decided to migrate to <a class="au li" href="https://facebook.github.io/metro/" rel="noopener ugc nofollow" target="_blank">Metro</a>.</p><p id="ecd9" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Thanks to the switch to Metro, we’ve improved our build performance. In development, the time it takes for a simple UI change to be reflected and loaded (the <a class="au li" href="https://developer.mozilla.org/en-US/docs/Glossary/Time_to_interactive" rel="noopener ugc nofollow" target="_blank">Time to Interactive</a>, or TTI, metric) is <strong class="jz jb">80% faster</strong>. The slowest production build compiling ~49k modules (JavaScript files) is <strong class="jz jb">55% faster</strong> (down from 30.5 minutes to 13.8 minutes). As an added bonus, we’ve observed an improvement of ~<strong class="jz jb">1%</strong> in the <a class="au li" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Airbnb Page Performance Scores</a> for pages built with Metro.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go mm"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*MlrQ2yEBbHj0OOHc_sYxtg.png" width="700" height="189" role="presentation" /></div></div></figure><p id="30d1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Scaling issues with JavaScript bundlers certainly aren’t a problem unique to Airbnb. In this blog post, we want to highlight the key architectural differences between Webpack and Metro as well as some of the migration challenges we faced in both development and production builds. 
If you anticipate that one of your own projects will scale up significantly in the future, we hope this post provides useful insights into solving this problem.</p><h1 id="8b9d" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">What is Metro?</h1><p id="4583" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc"><a class="au li" href="https://facebook.github.io/metro/" rel="noopener ugc nofollow" target="_blank">Metro</a> is the open source JavaScript bundler for React Native. While <a class="au li" rel="noopener" href="https://medium.com/airbnb-engineering/sunsetting-react-native-1868ba28e30a">Airbnb no longer uses React Native</a>, we believed the infrastructure could be leveraged for the web as well. After numerous consultations with the Metro folks at Meta as well as some of our own modifications, we managed to build a flavor of Metro that now powers both development and production bundling for all Airbnb websites.</p><p id="a6fd" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Conceptually, Metro breaks bundling down into three steps, in the following order: <a class="au li" href="https://facebook.github.io/metro/docs/concepts#resolution" rel="noopener ugc nofollow" target="_blank">resolution</a>, <a class="au li" href="https://facebook.github.io/metro/docs/concepts#transformation" rel="noopener ugc nofollow" target="_blank">transformation</a>, and <a class="au li" href="https://facebook.github.io/metro/docs/concepts#serialization" rel="noopener ugc nofollow" target="_blank">serialization</a>.</p><ul class=""><li id="5356" class="mn mo ja jz b ka kb ke kf ki mp km mq kq mr ku ms mt mu mv gc">Resolution deals with how to resolve import/require statements.</li><li id="29be" class="mn mo ja jz b ka mw ke mx ki my km mz kq na ku ms mt mu mv gc">Transformation is responsible for transpiling code (source-to-source compilation that converts modern 
TypeScript/JavaScript source code into functionally equivalent JavaScript code that’s more optimized and backwards compatible with older browsers). <a class="au li" href="https://babeljs.io/" rel="noopener ugc nofollow" target="_blank">Babel</a> is one example of such a tool.</li><li id="6d2a" class="mn mo ja jz b ka mw ke mx ki my km mz kq na ku ms mt mu mv gc">Serialization combines the transformed files into JavaScript bundles.</li></ul><p id="c9dd" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">These three concepts are the fundamental building blocks for understanding how Metro works. In the following sections, we highlight the key architectural differences between Metro and Webpack to provide deeper context on Metro’s strengths.</p><h1 id="1368" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Key architectural differences between Metro and Webpack</h1><h1 id="754c" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Process JS bundles on demand in development</h1><p id="8776" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">A JavaScript bundle is technically just a serialized dependency graph, with an entry point as the root of the graph. At Airbnb, a web page maps to a single entry point. In development, Webpack (even v5, the latest version) requires knowing <a class="au li" href="https://webpack.js.org/concepts/entry-points/" rel="noopener ugc nofollow" target="_blank">the entry points</a> for all pages before it can start bundling. 
On the other hand, the Metro development server processes the requested JavaScript bundles on the fly.</p><p id="693f" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">More specifically, at Airbnb, every frontend project has a Node server which matches a route to a specific entry point. When a web page is requested, the DOM includes script tags with the development JavaScript URLs. The browser loads the page, and makes requests to the Metro development server for the JavaScript bundles. In Figure 1, we illustrate the difference between our Metro &amp; Webpack development setup:</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nb"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*d0S7RQA6IXt1YqAO" width="700" height="485" role="presentation" /></div></div></figure><p id="09f3" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Figure 1: Differences between the JS bundle development setups for Metro and Webpack</p><p id="e31a" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In this example, there is a web project with three entry points: entryPageA.js, entryPageB.js, and entryPageC.js. A developer makes changes to Page A, which includes only the entryPageA.js bundle. As you can see in Figure 1, in both scenarios, the browser loads Page A (1), then requests the entryPageA.js file from the bundler (2), and finally the bundler responds to the browser with the appropriate bundles (4). With the Webpack bundler (1a), even though the browser only requests entryPageA.js, Webpack compiles all entry points on start-up before it can respond to the entryPageA.js request from the browser. 
On the other hand, with the Metro bundler (1b), we see that the development server does not spend any time compiling entryPageB.js or entryPageC.js, instead only compiling entryPageA.js before responding to the browser request.</p><p id="c794" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">One of the biggest frontend projects at Airbnb has ~26k unique modules, with the median number of modules per page being ~7.2k modules. Because we also do server side rendering, the number of modules we ultimately have to process doubles to roughly ~48k. With Metro’s development model, we saved ~70% of work by compiling JavaScript on demand.</p><p id="cb77" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">This key architectural difference improves the developer experience, as Metro only compiles what is needed (JavaScript bundles on the pages requested), whereas Webpack pre-compiles the entire project on start-up.</p><h1 id="bcda" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Multi-layered cache</h1><p id="dc67" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">Another powerful Metro feature we leverage is its <a class="au li" href="https://facebook.github.io/metro/docs/caching" rel="noopener ugc nofollow" target="_blank">multi-layered caching</a> feature, which makes setting up both persistent and non-persistent caches straightforward. While Webpack 5 also comes with <a class="au li" href="https://webpack.js.org/guides/build-performance/#persistent-cache" rel="noopener ugc nofollow" target="_blank">a disk persistent cache</a>, it isn’t as flexible as Metro’s multi-layered cache. 
Webpack offers two <a class="au li" href="https://webpack.js.org/configuration/cache/#cachetype" rel="noopener ugc nofollow" target="_blank">distinct cache types</a>, “filesystem” and “memory”, so caching is limited to disk or memory, with no remote cache capability. In comparison, Metro provides more flexibility by allowing us to define the cache implementation, including mixing different types of cache layers. If a layer has a cache miss, Metro attempts to retrieve the entry from the next layer, and so on.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div class="gn go nc"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1306/0*bvgMUBI6wm9xe0fZ" width="653" height="350" role="presentation" /></div></figure><p id="227e" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Figure 2: How Airbnb configures the multi-layered cache with Metro</p><p id="f16c" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The ordering of the caches determines the cache priority. When retrieving a cached entry, the first cache layer with a result will be used. In the setup illustrated in Figure 2, the in-memory cache layer, the fastest, comes first, followed by the file/disk cache, and lastly the remote read-only cache. Compared with the default Metro implementation without a cache, hitting a remote read-only cache resulted in a 56% faster server build in a project compiling 22k files.</p><p id="d53d" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">One contributing factor to Metro’s performance is its built-in worker support, which amplifies the effect of the multi-layer cache. 
While Webpack requires careful configuration to leverage workers via a <a class="au li" href="https://webpack.js.org/loaders/thread-loader/" rel="noopener ugc nofollow" target="_blank">separate loader</a>, Metro by default spins up workers to offload expensive transforms, allowing for increased parallelization without configuration.</p><p id="dbae" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">But why use a remote read-only cache instead of a regular remote cache (read &amp; write)? We discovered that not writing to the remote cache saved an additional 17% of build time in development for the same project with 22k files. Writing to the remote cache incurs network calls that can be costly, especially on a slower network. To populate the cache, instead of relying on remote cache writes, we introduced a CI job that runs periodically on the default branch commit.</p><h1 id="972b" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Serialization</h1><p id="3b9f" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">In the bundler context, serialization means combining the transformed source files into one or more bundles. In Webpack, the concept of serialization is encapsulated in <a class="au li" href="https://webpack.js.org/api/compilation-hooks/#root" rel="noopener ugc nofollow" target="_blank">the compilation hooks</a> (Webpack’s public APIs). In Metro, a serializer function is responsible for combining source files into bundles.</p><p id="74bc" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">As an example of the importance of serialization, let’s take a look at internationalization support. 
We currently support Airbnb websites in around 70 locales, and in 2020, our<a class="au li" rel="noopener" href="https://medium.com/airbnb-engineering/building-airbnbs-internationalization-platform-45cf0104b63c"> internationalization platform</a> served more than 1 million pieces of content. To support internationalization with JS bundles, we need to implement specific logic in the serialization step. Although we had to implement similar internationalization logic when serializing bundles for both Metro and Webpack, Webpack required lots of source code reading to find the appropriate compilation hooks for us to implement the support. On top of all that, it also required understanding the intricacies of concepts like what dependency templates are and how to write our own. Comparatively, it is a breath of fresh air to implement the same internationalization support with Metro. We only have to focus on how to serialize JS bundles with translation content and all the tasks are done in the single serializer function. The simplicity of Metro’s bundling concepts makes implementing any bespoke feature straightforward.</p><h1 id="c952" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Challenges of Adopting Metro at Airbnb</h1><p id="d0a8" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">Even though Metro has the architectural advantages described above, it also brought challenges to overcome in order to leverage it fully for the web. 
Because Metro is designed for use in a React Native environment, we needed to write <em class="kv">more code</em> to achieve feature parity with Webpack. The decision to switch to Metro therefore came at the expense of reinventing some wheels and learning the inner workings of a JavaScript bundler that are usually abstracted away from us.</p><p id="1812" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In development, we had to create a Metro server with custom endpoints to handle building dependency graphs, translation, bundling JS &amp; CSS files, and building source maps. For production builds, we ran Metro as a Node API to handle resolution, transformation, and serialization.</p><p id="b0d9" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The surface area of the full migration was substantial, so we broke it down into two phases. Because the slow iteration speed of our Webpack setup incurred significant costs in developer productivity, our first priority was to address the slow Webpack development experience with the Metro development server. In the second phase, we brought Metro to feature parity with Webpack and ran an A/B test between Metro and Webpack in production. The two biggest challenges we faced along the way are outlined below.</p><h1 id="c32f" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Bundle Splitting</h1><p id="3d7c" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">The out-of-the-box Metro setup for development produced giant ~5MiB bundles per entry point, since a single bundle is the intended use case for React Native. For the Web, this bundle size was taxing on browser resources and network latency. 
Every code change resulted in a 5MiB bundle being processed and downloaded; this was inefficient, and the bundle could not be HTTP-cached. Even if the changed code recompiled instantly, we still needed to reduce the size and improve browser cacheability.</p><p id="dedf" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">To improve the performance of Metro in the Web environment, we split the bundles by <a class="au li" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import#dynamic_import" rel="noopener ugc nofollow" target="_blank">dynamic import</a> boundaries, a technique also known as <a class="au li" href="https://developer.mozilla.org/en-US/docs/Glossary/Code_splitting" rel="noopener ugc nofollow" target="_blank">code splitting</a>. The code splitting boundaries enabled us to leverage HTTP caching effectively.</p><p id="e1bd" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In Figure 3, import('./file') represents the dynamic import boundaries. The bundle on the left-hand side (3a) is broken down into three smaller bundles on the right (3b). The additional bundles are requested when the import('./file') statements are executed.</p><p id="c84a" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In Figure 3a, if fileA.js changes, the entire bundle needs to be re-downloaded for the browser to pick up the change in fileA.js. With bundles split by dynamic import, as illustrated in Figure 3b, a change in fileA.js results in re-downloading only the fileA.js bundle. 
The rest of the bundles can reuse browser cache.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nd"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*R2QCSRzc7ysunr3_" width="700" height="357" role="presentation" /></div></div></figure><p id="adf5" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Figure 3: Splitting bundles by dynamic import boundaries. A bundle is represented by the rectangular boxes with a pink background.</p><p id="1a32" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">When we began to think about production bundles, we wanted to optimize a bit differently than in development. It takes time to run the bundle splitting algorithm, and we didn’t want to waste time on optimizing bundle sizes in development. Instead, we prioritized the page load performance over minimizing bundle sizes.</p><p id="b33b" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In production, we wanted to ship fewer and smaller JavaScript bundles to the end user so the page loads faster and the user experience is performant. There is no Metro development server in production, so all the bundles are pre-built. This makes bundle splitting the biggest blocking feature needed to make our Metro build production ready. With some inspiration from Webpack’s bundle splitting algorithm, we implemented a similar mechanism to split the Metro dependency graphs. 
The resulting bundle sizes decreased by ~20% (1549 KB –&gt; 1226 KB) on airbnb.com compared with the development approach of splitting by dynamic import boundaries.</p><p id="1aea" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">When we compared the bundle splitting results of Metro’s and Webpack’s implementations, we found that both produced bundles of comparable sizes, with a few pages shipping slightly more JavaScript bundles with Metro. Despite the slightly heavier page weight, <a class="au li" href="https://developer.mozilla.org/en-US/docs/Glossary/First_contentful_paint" rel="noopener ugc nofollow" target="_blank">TTFCP</a>, <a class="au li" href="https://developer.mozilla.org/en-US/docs/Web/API/Largest_Contentful_Paint_API" rel="noopener ugc nofollow" target="_blank">Largest Contentful Paint</a>, and <a class="au li" href="https://web.dev/lighthouse-total-blocking-time/" rel="noopener ugc nofollow" target="_blank">Total Blocking Time</a> metrics are comparable between Metro and Webpack.</p><h1 id="c601" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Tree-shaking</h1><p id="f469" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">Bundle splitting alone decreased bundle sizes significantly, but we were able to make bundles even smaller by deleting dead code. It is not always obvious, however, what counts as dead code in a project, as some “dead code” in one project may be “used code” in other projects. This is where tree-shaking came into play. 
It relied on the consistent usages of <a class="au li" href="https://tc39.es/ecma262/#sec-modules" rel="noopener ugc nofollow" target="_blank">ECMAScript modules</a> (ESM) <a class="au li" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import" rel="noopener ugc nofollow" target="_blank">import</a>/<a class="au li" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/export" rel="noopener ugc nofollow" target="_blank">export</a> statements in the code base. Based on the import/export usages in a project, we analyzed what specific export statements were not imported by any file in the project. Finally, the bundler removes the unused export statements, making the overall bundle sizes smaller.</p><p id="da97" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">One challenge we faced while implementing the tree-shaking algorithm for Metro production builds was the risk of mistakenly removing code that is executed at runtime. For example, we ran into multiple bugs related to <a class="au li" href="https://developer.mozilla.org/en-US/docs/web/javascript/reference/statements/export#re-exporting_aggregating" rel="noopener ugc nofollow" target="_blank">re-export statements</a>. Since Webpack handles ESM import/export statements in a different way, there was no comparable prior art for reference. 
After multiple iterations of tree-shaking algorithm implementation, the following table captures how much dead code we were finally able to drop given the project size.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go ne"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*WlsTTqWIGeJzk_ccUzFuBw.png" width="700" height="121" role="presentation" /></div></div></figure><h1 id="4248" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Conclusion</h1><p id="2629" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">The Metro migration brought forth some very significant improvements. The biggest Airbnb frontend project compiling ~48k modules (including server and browser compilations) saw a drop in the average build time by ~55% from 30.5 minutes to 13.8 minutes. Additionally, we saw improvements on the <a class="au li" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Airbnb Page Performance Scores</a> with the pages built by Metro, ranging around +1%. The end user performance improvement was a nice surprise, as we initially aimed for achieving neutral experiment results.</p><p id="24ba" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The simplicity of Metro’s architecture has benefited us in many ways. Engineers from other teams have ramped up quickly to contribute to Airbnb’s Metro implementation, which means there is a lower barrier to entry for contributing to the bundling system. The <a class="au li" href="https://facebook.github.io/metro/docs/caching" rel="noopener ugc nofollow" target="_blank">multi-layered cache system</a> is straightforward to work with, making experimentation with caching possible. 
The bespoke bundler feature integrations are made obvious and easier to implement.</p><p id="7ccf" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">We acknowledge that the landscape has changed since we evaluated <a class="au li" href="https://parceljs.org/" rel="noopener ugc nofollow" target="_blank">Parcel</a>, <a class="au li" href="https://webpack.js.org/" rel="noopener ugc nofollow" target="_blank">Webpack 4</a>, and <a class="au li" href="https://facebook.github.io/metro/" rel="noopener ugc nofollow" target="_blank">Metro</a> back in 2018. There are other tools, such as <a class="au li" href="https://rollupjs.org/guide/en/" rel="noopener ugc nofollow" target="_blank">rollup.js</a> and <a class="au li" href="https://esbuild.github.io/" rel="noopener ugc nofollow" target="_blank">esbuild</a>, that we haven’t explored much, and we know that Metro isn’t a general-purpose JavaScript bundler when compared to Webpack. However, after a few years of working on Metro feature parity, the results we have seen have proven to us that it was a good decision to pursue Metro. Metro <strong class="jz jb">solved</strong> our most desperate scaling issues by dropping development and production build times. We are more productive than ever with instantaneous development feedback loops and faster production builds. 
If you would like to help us continue to improve our JavaScript tooling and build optimization, or tackle other web infrastructure challenges, check out these open roles at Airbnb:</p><p id="6393" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au li" href="https://careers.airbnb.com/positions/3903900/?gh_src=61d6ab411us" rel="noopener ugc nofollow" target="_blank">Senior Frontend Infrastructure Engineer, Web Platform</a></p><p id="8375" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au li" href="https://careers.airbnb.com/positions/3903900/?gh_src=61d6ab411us" rel="noopener ugc nofollow" target="_blank">Engineering Manager, Infrastructure</a></p><p id="65b9" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au li" href="https://careers.airbnb.com/positions/2623004/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Cloud Infrastructure</a></p><p id="4dac" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au li" href="https://careers.airbnb.com/positions/4168852/" rel="noopener ugc nofollow" target="_blank">Senior/Staff Software Engineer, Observability</a></p><h1 id="7e86" class="lj lk ja bn ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Acknowledgments</h1><p id="69a1" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">Thank you everyone who has contributed to this multi-year project. We couldn’t have done it without any of you! 
Special shoutout to my lovely team <a class="au li" href="mailto:michael.james@airbnb.com" rel="noopener ugc nofollow" target="_blank">Michael James</a> and <a class="au li" href="mailto:noah.sugarman@airbnb.com" rel="noopener ugc nofollow" target="_blank">Noah Sugarman</a> for driving the Metro production migration to the finish line. Thank you <a class="au li" href="mailto:breanna.bunge@airbnb.com" rel="noopener ugc nofollow" target="_blank">Brie Bunge</a>, <a class="au li" href="mailto:dan.beam@airbnb.com" rel="noopener ugc nofollow" target="_blank">Dan Beam</a>, <a class="au li" href="mailto:ian.myers@airbnb.com" rel="noopener ugc nofollow" target="_blank">Ian Myers</a>, <a class="au li" href="mailto:ian.remmel@airbnb.com" rel="noopener ugc nofollow" target="_blank">Ian Remmel</a>, <a class="au li" href="mailto:joe.lencioni@airbnb.com" rel="noopener ugc nofollow" target="_blank">Joe Lencioni</a>, <a class="au li" href="mailto:madison.capps@airbnb.com" rel="noopener ugc nofollow" target="_blank">Madison Capps</a>, <a class="au li" href="mailto:michael.james@airbnb.com" rel="noopener ugc nofollow" target="_blank">Michael James</a>, <a class="au li" href="mailto:noah.sugarman@airbnb.com" rel="noopener ugc nofollow" target="_blank">Noah Sugarman</a> for reviewing and giving great feedback on this blog post.</p><p id="9ae1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><em class="kv">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/faster-javascript-builds-with-metro-cfc46d617a1f</link>
      <guid>https://medium.com/airbnb-engineering/faster-javascript-builds-with-metro-cfc46d617a1f</guid>
      <pubDate>Tue, 24 May 2022 19:39:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Dynamic Kubernetes Cluster Scaling at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@drmorr-airbnb?source=post_page-----d79ae3afa132--------------------------------"><div class="l dq"><img alt="David Morrison" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/0*e-69iB7-RczIzZUB" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@drmorr-airbnb?source=post_page-----d79ae3afa132--------------------------------">David Morrison</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">May 23</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">10 min read</div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><p id="c0a3" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Authors: <a class="au kv" href="https://www.linkedin.com/in/evansheng112/" rel="noopener ugc nofollow" target="_blank">Evan Sheng</a>, <a class="au kv" href="https://www.linkedin.com/in/david-morrison-9419b110/" rel="noopener ugc nofollow" target="_blank">David Morrison</a></p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go kw"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/1*Elojmgc7Y06tItOaLdB0Cw.jpeg" width="700" height="467" role="presentation" /></div></div></figure><h1 id="b575" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Introduction</h1><p id="7b38" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">An important part of running Airbnb’s infrastructure is ensuring our cloud spending automatically scales with demand, both up <strong class="jz jb">and </strong>down. Our traffic fluctuates heavily every day, and our cloud footprint should scale dynamically to support this.</p><p id="1e28" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">To support this scaling, Airbnb utilizes Kubernetes, an open source container orchestration system. 
We also utilize OneTouch, a service configuration interface built on top of Kubernetes, which is described in more detail in a previous <a class="au kv" rel="noopener" href="https://medium.com/airbnb-engineering/a-krispr-approach-to-kubernetes-infrastructure-a0741cff4e0c">post</a>.</p><p id="4812" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In this post, we’ll talk about how we dynamically size our clusters using the Kubernetes Cluster Autoscaler, and highlight functionality we’ve contributed to the <a class="au kv" href="https://github.com/kubernetes/community/tree/master/sig-autoscaling" rel="noopener ugc nofollow" target="_blank">sig-autoscaling community</a>. These improvements add customizability and flexibility to meet Airbnb’s unique business requirements.</p><h1 id="d902" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Kubernetes Clusters at Airbnb</h1><p id="8428" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Over the past few years, Airbnb has shifted almost all online services from manually orchestrated EC2 instances to Kubernetes. Today, we run thousands of nodes across nearly a hundred clusters to accommodate these workloads. However, this change didn’t happen overnight. During this migration, our underlying Kubernetes cluster setup evolved and became more sophisticated as more workloads and traffic shifted to our new technology stack. 
This evolution can be split into three stages.</p><p id="62e6" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Stage 1: Homogeneous Clusters, Manual Scaling</p><p id="bae1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Stage 2: Multiple Cluster Types, Independently Autoscaled</p><p id="e56e" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Stage 3: Heterogeneous Clusters, Autoscaled</p><h2 id="c760" class="ml lj ja bn lk mm mn mo lo mp mq mr ls ki ms mt lw km mu mv ma kq mw mx me my gc">Stage 1: Homogeneous Clusters, Manual Scaling</h2><p id="c986" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Before using Kubernetes, each instance of a service was run on its own machine, and manually scaled to have the proper capacity to handle traffic increases. Capacity management varied per team, and capacity would rarely be deprovisioned once load dropped.</p><p id="28c0" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Our initial Kubernetes cluster setup was relatively basic. We had a handful of clusters, each with a single underlying node type and configuration, which ran only stateless online services. As some of these services began shifting to Kubernetes, we started running containerized services in a multi-tenant environment (many pods on a node). This aggregation led to fewer wasted resources, and consolidated capacity management for these services to a single control point at the Kubernetes control plane. 
At this stage, we scaled our clusters manually, but this was still a marked improvement over the previous situation.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go mz"><img alt="An EC2 node running a single application vs. a Kubernetes node running 3 applications." class="cf lg lh" src="https://miro.medium.com/max/1400/0*xgJUXKfck5DuQOg1" width="700" height="491" /></div></div><figcaption class="na bm gp gn go nb nc bn b bo bp co">Figure 1: EC2 Nodes vs Kubernetes Nodes</figcaption></figure><h2 id="3b1d" class="ml lj ja bn lk mm mn mo lo mp mq mr ls ki ms mt lw km mu mv ma kq mw mx me my gc">Stage 2: Multiple Cluster Types, Independently Autoscaled</h2><p id="3159" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">The second stage of our cluster configuration began when more diverse workload types, each with different requirements, sought to run on Kubernetes. To accommodate their needs, we created a cluster type abstraction. A “cluster type” defines the underlying configuration for a cluster, meaning that all clusters of a cluster type are identical, from node type to different cluster component settings.</p><p id="2341" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">More cluster types led to more clusters, and our initial strategy of manually managing capacity of each cluster quickly fell apart. To remedy this, we added the Kubernetes <a class="au kv" href="https://github.com/kubernetes/autoscaler" rel="noopener ugc nofollow" target="_blank">Cluster Autoscaler</a> to each of our clusters. This component automatically adjusts cluster size based on pod requests — if a cluster’s capacity is exhausted, and a pending pod’s request could be filled by adding a new node, Cluster Autoscaler launches one. 
Similarly, if there are nodes in a cluster that have been underutilized for an extended period of time, Cluster Autoscaler will remove these from the cluster. Adding this component worked beautifully for our setup: it saved us roughly 5% of our total cloud spend and eliminated the operational overhead of manually scaling clusters.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nd"><img alt="Different Kubernetes clusters for different types of applications (CPU-bound or GPU-bound applications, for example)." class="cf lg lh" src="https://miro.medium.com/max/1400/0*XevtJSPUAo9vTpJn" width="700" height="459" /></div></div><figcaption class="na bm gp gn go nb nc bn b bo bp co">Figure 2: Kubernetes Cluster Types</figcaption></figure><h2 id="2285" class="ml lj ja bn lk mm mn mo lo mp mq mr ls ki ms mt lw km mu mv ma kq mw mx me my gc">Stage 3: Heterogeneous Clusters, Autoscaled</h2><p id="4605" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">When nearly all online compute at Airbnb shifted to Kubernetes, the number of cluster types had grown to over 30, and the number of clusters to 100+. This expansion made Kubernetes cluster management tedious. For example, cluster upgrades had to be individually tested on each of our numerous cluster types.</p><p id="2187" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In this third phase, we aimed to consolidate our cluster types by creating “heterogeneous” clusters that could accommodate many diverse workloads with a single Kubernetes control plane. First, this greatly reduces cluster management overhead, as having fewer, more general purpose clusters reduces the number of configurations to test. 
Second, with the majority of Airbnb now running on our Kubernetes clusters, efficiency in each cluster provides a big lever to reduce cost. Consolidating cluster types allows us to run varied workloads in each cluster. This aggregation of workload types (some big and some small) can lead to better bin packing and efficiency, and thus higher utilization. With this additional workload flexibility, we had more room to implement sophisticated scaling strategies, outside of the default Cluster Autoscaler expansion logic. Specifically, we aimed to implement scaling logic that was tied to Airbnb-specific business logic.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nd"><img alt="A single Kubernetes cluster with multiple different node types: an Intel compute node, an AMD compute node, a high-memory node, and a GPU node." class="cf lg lh" src="https://miro.medium.com/max/1400/0*1GUcmg4jijWf_fdA" width="700" height="506" /></div></div><figcaption class="na bm gp gn go nb nc bn b bo bp co">Figure 3: A heterogeneous Kubernetes cluster</figcaption></figure><p id="8eac" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">As we scaled and consolidated clusters so they were heterogeneous (multiple instance types per cluster), we began to implement specific business logic during expansion and realized some changes to the autoscaling behavior were necessary. 
The next section will describe some of the changes we’ve made to Cluster Autoscaler to make it more flexible.</p><h1 id="2481" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Cluster Autoscaler Improvements</h1><h2 id="c516" class="ml lj ja bn lk mm mn mo lo mp mq mr ls ki ms mt lw km mu mv ma kq mw mx me my gc">Custom gRPC Expander</h2><p id="9961" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">The most significant improvement we made to Cluster Autoscaler was to provide a new method for determining node groups to scale. Internally, Cluster Autoscaler maintains a list of node groups which map to different candidates for scaling, and it filters out node groups that do not satisfy pod scheduling requirements by running a scheduling simulation against the current set of Pending (unschedulable) pods. If there are any Pending (unschedulable) pods, Cluster Autoscaler attempts to scale the cluster to accommodate these pods. Any node groups that satisfy all pod requirements are passed to a component called the <a class="au kv" href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-expanders" rel="noopener ugc nofollow" target="_blank">Expander</a>.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nd"><img alt="A depiction of Cluster Autoscaler, which calls the Expander to determine which type of node to add in a heterogeneous Kubernetes cluster." 
class="cf lg lh" src="https://miro.medium.com/max/1400/0*ryQyolVdPY6bbSQy" width="700" height="391" /></div></div><figcaption class="na bm gp gn go nb nc bn b bo bp co">Figure 4: Cluster Autoscaler and Expander</figcaption></figure><p id="d56a" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The Expander is responsible for further filtering the node groups based on operational requirements. Cluster Autoscaler has a number of different built-in expander options, each with different logic. For example, the default is the random expander, which selects from available options uniformly at random. Another option, and the one that Airbnb has historically used, is the <a class="au kv" href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/expander/priority" rel="noopener ugc nofollow" target="_blank">priority expander</a>, which chooses which node group to expand based on a user-specified tiered priority list.</p><p id="d52e" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">As we moved toward our heterogeneous cluster logic, we found that the built-in expanders were not sophisticated enough to satisfy our more complex business requirements around cost and instance type selection.</p><p id="11f6" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">As a contrived example, say we want to implement a weighted priority expander. Currently, the priority expander only lets users specify distinct tiers of node groups, meaning it will always expand tiers deterministically and in order. If there are multiple node groups in a tier, it will break ties randomly. 
A weighted priority strategy of setting two node groups in the same tier, but expanding one 80% of the time and the other 20% of the time, is not achievable with the default setup.</p><p id="1959" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Beyond the limitations of the currently supported expanders, there were a few operational concerns:</p><ol class=""><li id="53e6" class="ne nf ja jz b ka kb ke kf ki ng km nh kq ni ku nj nk nl nm gc">Cluster Autoscaler’s release pipeline is rigorous and changes take time to review before being merged upstream. However, our business logic and desired scaling strategy are continuously changing. An expander developed to fill our needs today may not fulfill our needs in the future</li><li id="17b4" class="ne nf ja jz b ka nn ke no ki np km nq kq nr ku nj nk nl nm gc">Our business logic is specific to Airbnb and not necessarily applicable to other users. Any changes we implement specific to our logic would not be useful to contribute back upstream</li></ol><p id="34cc" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">From these, we came up with a set of requirements for a new expander type in Cluster Autoscaler:</p><ol class=""><li id="67c3" class="ne nf ja jz b ka kb ke kf ki ng km nh kq ni ku nj nk nl nm gc">We wanted something that was both extensible and usable by others. 
Others may run into similar limitations with the default Expanders at scale, and we would like to provide a generalized solution and contribute functionality back upstream</li><li id="bf27" class="ne nf ja jz b ka nn ke no ki np km nq kq nr ku nj nk nl nm gc">Our solution should be deployable out of band with Cluster Autoscaler, and allow us to respond more rapidly to changing business needs</li><li id="dae2" class="ne nf ja jz b ka nn ke no ki np km nq kq nr ku nj nk nl nm gc">Our solution should fit into the Kubernetes Cluster Autoscaler ecosystem, so that we do not have to maintain a fork of Cluster Autoscaler indefinitely</li></ol><p id="c8ea" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">With these requirements, we came up with a design that breaks out the expansion responsibility from the Cluster Autoscaler core logic. We designed a pluggable “<a class="au kv" href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/expander/grpcplugin" rel="noopener ugc nofollow" target="_blank">custom Expander</a>,” which is implemented as a gRPC client (similar to the <a class="au kv" href="https://github.com/kubernetes/autoscaler/blob/68c984472acce69cba89d96d724d25b3c78fc4a0/cluster-autoscaler/proposals/plugable-provider-grpc.md" rel="noopener ugc nofollow" target="_blank">custom cloud provider</a>). This custom expander is broken into two components.</p><p id="38a8" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The first component is a gRPC client built into Cluster Autoscaler. 
This Expander conforms to the same interface as other Expanders in Cluster Autoscaler. It transforms information about valid node groups from Cluster Autoscaler into a defined <a class="au kv" href="https://developers.google.com/protocol-buffers/docs/overview" rel="noopener ugc nofollow" target="_blank">protobuf</a> schema (shown below), then receives the output from the gRPC server and transforms it back into a final list of options for Cluster Autoscaler to scale up.</p><pre class="kx ky kz la gz ns bt nt">service Expander {<br />  rpc BestOptions (BestOptionsRequest) returns (BestOptionsResponse);<br />}<br /><br />message BestOptionsRequest {<br />  repeated Option options = 1;<br />  map&lt;string, k8s.io.api.core.v1.Node&gt; nodeInfoMap = 2;<br />}<br /><br />message BestOptionsResponse {<br />  repeated Option options = 1;<br />}<br /><br />message Option {<br />  // ID of node to uniquely identify the nodeGroup<br />  string nodeGroupId = 1;<br />  int32 nodeCount = 2;<br />  string debug = 3;<br />  repeated k8s.io.api.core.v1.Pod pod = 4;<br />}</pre><p id="9fd5" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">The second component is the gRPC server, which is left up to the user to write. This server is intended to be run as a separate application or service, which can run arbitrarily complex expansion logic when selecting which node group to scale up, using the information passed from the client. 
Currently, the protobuf messages passed over gRPC are slightly transformed versions of what is passed to the Expander in Cluster Autoscaler.</p><p id="219b" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Returning to our earlier example, a weighted random priority expander can be implemented fairly easily by having the server read a priority tier list and a weighted percentage configuration from a ConfigMap, and choose accordingly.</p><figure class="kx ky kz la gz lb gn go paragraph-image"><div role="button" tabindex="0" class="lc ld dq le cf lf"><div class="gn go nd"><img alt="" class="cf lg lh" src="https://miro.medium.com/max/1400/0*MldTcDs1Df38IfHE" width="700" height="352" role="presentation" /></div></div><figcaption class="na bm gp gn go nb nc bn b bo bp co">Figure 5: Cluster Autoscaler and Custom gRPC Expander</figcaption></figure><p id="11a2" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Our implementation includes a failsafe option. We recommend passing <a class="au kv" href="https://github.com/kubernetes/autoscaler/pull/4233" rel="noopener ugc nofollow" target="_blank">multiple expanders</a> as arguments to Cluster Autoscaler. 
With this option, if the server fails, Cluster Autoscaler is still able to expand using a fallback Expander.</p><p id="0d68" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Since it runs as a separate application, expansion logic can be developed out of band with Cluster Autoscaler, and since the gRPC server is customizable by the user based on their needs, this solution is extensible and useful to the wider community.</p><p id="e020" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Internally, Airbnb has been using this new solution to scale all of our clusters without issues since the beginning of 2022. It has allowed us to dynamically choose when to expand certain node groups to meet Airbnb’s business needs, thus achieving our initial goal of developing an extensible custom expander.</p><p id="0330" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Our custom expander was <a class="au kv" href="https://github.com/kubernetes/autoscaler/pull/4452" rel="noopener ugc nofollow" target="_blank">accepted</a> into the upstream Cluster Autoscaler earlier this year, and will be available in the next release (v1.24.0).</p><h1 id="7664" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Other Autoscaler Improvements</h1><p id="8ab3" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">Over the course of our migration to heterogeneous Kubernetes clusters, we identified a number of other bugs and improvements that could be made to Cluster Autoscaler. 
These are briefly described below:</p><ul class=""><li id="eb5e" class="ne nf ja jz b ka kb ke kf ki ng km nh kq ni ku od nk nl nm gc"><a class="au kv" href="https://github.com/kubernetes/autoscaler/pull/4489" rel="noopener ugc nofollow" target="_blank">Early abort for AWS ASGs with no capacity</a>: Short-circuit the loop in which Cluster Autoscaler waits for the nodes it tries to spin up to become ready, by calling an AWS EC2 endpoint to check whether the ASG has capacity. With this change enabled, users get much faster, yet still correct, scaling. Previously, users with a priority ladder had to wait 15 minutes after each attempted ASG launch before Cluster Autoscaler tried an ASG of lower priority.</li><li id="119a" class="ne nf ja jz b ka nn ke no ki np km nq kq nr ku od nk nl nm gc"><a class="au kv" href="https://github.com/kubernetes/autoscaler/pull/4073" rel="noopener ugc nofollow" target="_blank">Caching launch templates to reduce AWS API calls</a>: Introduce a cache for AWS ASG Launch Templates. This change unlocks using large numbers of ASGs, which was critical for our generalized cluster strategy. Previously, for empty ASGs (those with no nodes present in the cluster), Cluster Autoscaler would repeatedly call an AWS endpoint to get launch templates, resulting in throttling from the AWS API.</li></ul><h1 id="254c" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Conclusion</h1><p id="4827" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">In the last four years, Airbnb has come a long way in our Kubernetes cluster setup. Having the largest portion of compute at Airbnb on a single platform provided a strong, consolidated lever to improve efficiency, and we are now focused on generalizing our cluster setup (think <a class="au kv" href="http://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/" rel="noopener ugc nofollow" target="_blank">“cattle, not pets”</a>). 
By developing and using a more sophisticated expander in Cluster Autoscaler (as well as fixing a number of other minor issues with the Autoscaler), we have been able to achieve our goal of developing a complex, business-specific scaling strategy around cost and mixed instance types, while also contributing some useful features back to the community.</p><p id="93ce" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">For more details on our heterogeneous cluster migration, watch our KubeCon <a class="au kv" href="https://www.youtube.com/watch?v=GCCSY7ERXj4&amp;ab_channel=CNCF%5BCloudNativeComputingFoundation%5D" rel="noopener ugc nofollow" target="_blank">talk</a>. We’re also at KubeCon EU this year, so come talk to us! If you’re interested in working on problems like the ones we’ve described here, we’re hiring! Check out these open roles:</p><p id="c312" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/3949745/" rel="noopener ugc nofollow" target="_blank">Engineering Manager, Infrastructure</a></p><p id="66c0" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/3903900/" rel="noopener ugc nofollow" target="_blank">Senior Front End Engineer</a></p><p id="2a45" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/2623004/" rel="noopener ugc nofollow" target="_blank">Senior Engineer, Cloud Infrastructure</a></p><p id="25aa" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/4168852/" rel="noopener ugc 
nofollow" target="_blank">Software Engineer, Observability</a></p><p id="1f60" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au kv" href="https://careers.airbnb.com/positions/3696687/" rel="noopener ugc nofollow" target="_blank">Software Engineer, Developer Infrastructure</a></p><h1 id="c684" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">Acknowledgements</h1><p id="cf62" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc">The evolution of our Kubernetes Cluster setup is the work of many different collaborators. Special thanks to Stephen Chan, Jian Cheung, Ben Hughes, Ramya Krishnan, David Morrison, Sunil Shah, Jon Tai and Long Zhang, as this work would not have been possible without them.</p><h1 id="d475" class="li lj ja bn lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf gc">****************</h1><p id="8ace" class="pw-post-body-paragraph jx jy ja jz b ka mg kc kd ke mh kg kh ki mi kk kl km mj ko kp kq mk ks kt ku it gc"><em class="oe">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/dynamic-kubernetes-cluster-scaling-at-airbnb-d79ae3afa132</link>
      <guid>https://medium.com/airbnb-engineering/dynamic-kubernetes-cluster-scaling-at-airbnb-d79ae3afa132</guid>
      <pubDate>Mon, 23 May 2022 19:35:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Kamini Dandapani]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----7f51f1fbb2bb--------------------------------"><div class="l dq"><img alt="AirbnbEng" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/1*PrgppbVAePgtuFs2XZa8Ig.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----7f51f1fbb2bb--------------------------------">AirbnbEng</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">May 11</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">5 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="id o ao"></div><div class="cl ij"><div></div></div></div><div class="ik il im j i d"><div class="fl l"><div class="ir l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ir l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ir l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" role="tooltip" 
aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="jb jc jd je jf"><div class=""><p id="d7c9" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">Airbnb’s VP of Engineering on why you don’t have to change your natural self to be a leader</p><figure class="le lf lg lh gz li gn go paragraph-image"><div role="button" tabindex="0" class="lj lk dq ll cf lm"><div class="gn go ld"><img alt="" class="cf ln lo" src="https://miro.medium.com/max/1400/0*t-dDS7QYW1gsBtG5" width="700" height="626" role="presentation" /></div></div></figure><p id="ca7b" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><a class="au lp" href="https://www.linkedin.com/in/kaminidandapani/" rel="noopener ugc nofollow" target="_blank">Kamini Dandapani</a>, VP of Engineering at Airbnb, leads the Infrastructure Engineering organization, which is in many ways the backbone of the company: responsible for powering the systems that keep Airbnb running smoothly and help new products reach millions of people. With a passion for how platforms can support and sustain the business and product, Kamini developed her considerate and welcoming leadership style at eBay and LinkedIn before joining Airbnb two years ago. In addition to her Infra role, she champions diversity and belonging in the workplace and is co-sponsor for Airbnb’s tech diversity council, which aims to create the most diverse and inclusive community in the tech industry.</p><p id="167f" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><em class="lq">Want to hear Kamini and other Infrastructure team members talk about some of the team’s latest projects? 
Check out the </em><a class="au lp" href="https://www.facebook.com/AirbnbTech/videos/635338454172729/" rel="noopener ugc nofollow" target="_blank"><em class="lq">“Powering Our Platform” Airbnb Tech Talk</em></a><em class="lq"> from March 2022. You’ll hear about some of the major initiatives we’re working on in next-generation service mesh, observability, feature engineering, and scalable storage.</em></p><h1 id="0d4d" class="lr ls ji bn lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm mn mo gc">From Chennai to Chicago</h1><p id="bbd6" class="pw-post-body-paragraph kf kg ji kh b ki mp kk kl km mq ko kp kq mr ks kt ku ms kw kx ky mt la lb lc jb gc">Growing up in India, I was the youngest of three girls. Despite facing skepticism and criticism from others around them, my parents invested heavily in our education and gave us a very strong footing, without which I don’t think I would be where I am today.</p><p id="6884" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">I started familiarizing myself with the engineering world, and found that I immensely enjoyed it. I did my undergrad in electronics and communication, and with my dad’s encouragement — he camped out overnight in the line in front of the US Consulate to get a visa — I came to the US to pursue my master’s in computer science.</p><p id="de53" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">In Chicago, I had to adjust to a lot of new experiences (including the winter cold!). In India, I never did anything alone, but here I had to do everything independently, from managing my finances to driving a car. 
After I graduated, I felt very fortunate to get a job in Silicon Valley, and I’ve stayed here ever since.</p><h1 id="8c6b" class="lr ls ji bn lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm mn mo gc">Leading at the intersection of platform and product</h1><p id="fe47" class="pw-post-body-paragraph kf kg ji kh b ki mp kk kl km mq ko kp kq mr ks kt ku ms kw kx ky mt la lb lc jb gc">Effective infrastructure can’t be built in a vacuum. Rather, it requires close partnerships with our product engineers to support both our product and overall business strategies. My professional sweet spot is where the platform architecture meets the end-user experience — plus scale!</p><p id="d6f7" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">In my engineering career, I worked at eBay for 12 years and grew into a director position, leading international expansion. After that, I was at LinkedIn for six years, leading infrastructure and tools for the consumer app, and that’s where I learned how to operate and develop a platform at scale. When Airbnb got in touch with me, I wasn’t looking for a change. But with every conversation that I had, there was something truly magical about the place — from the leadership, to the inclusivity, to the company’s mission — and I am so grateful that I made the leap.</p><p id="862e" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">What excited me most was bringing dozens of years of operating at scale to Airbnb. And one key component to operating at scale is working effectively and smoothly cross-functionally, and building close relationships with our product teams and key partners across the business. 
I’ve seen some truly incredible teamwork within my own team, and across all of Airbnb.</p><h1 id="59fa" class="lr ls ji bn lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm mn mo gc">Building the tech backbone of Airbnb</h1><p id="0e5d" class="pw-post-body-paragraph kf kg ji kh b ki mp kk kl km mq ko kp kq mr ks kt ku ms kw kx ky mt la lb lc jb gc">Most of the technical foundation that powers Airbnb comes from the Infrastructure organization. The impact that this group has is so wide and profound.</p><p id="f4a0" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">The Infrastructure organization has several key pillars:</p><ul class=""><li id="25fa" class="mu mv ji kh b ki kj km kn kq mw ku mx ky my lc mz na nb nc gc"><strong class="kh jj">Search Infrastructure</strong>, which powers the backend systems for our guest search experience</li><li id="a64d" class="mu mv ji kh b ki nd km ne kq nf ku ng ky nh lc mz na nb nc gc"><strong class="kh jj">Data Platform</strong>, for storing, processing and managing all the data that powers every user experience</li><li id="b0e0" class="mu mv ji kh b ki nd km ne kq nf ku ng ky nh lc mz na nb nc gc"><strong class="kh jj">Developer Platform</strong>, which helps make Airbnb engineers’ lives friction-free by building tools, services and environments to help them develop, build, test and deploy their code</li><li id="5d0a" class="mu mv ji kh b ki nd km ne kq nf ku ng ky nh lc mz na nb nc gc"><strong class="kh jj">Cloud Infrastructure,</strong> which delivers and operates the cloud environment that powers Airbnb</li><li id="69c5" class="mu mv ji kh b ki nd km ne kq nf ku ng ky nh lc mz na nb nc gc"><strong class="kh jj">Reliability Engineering, </strong>which remedies and prevents site performance issues through tooling and automation</li></ul><p id="9d4d" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc 
jb gc">Within each of these areas, we have many long-term, multi-year projects, all part of what we’re calling Tech Stack 2.0: an evolution and modernization of our technology. Some sample initiatives include <a class="au lp" href="https://news.airbnb.com/unique-stays-hosts-earn-more-than-300-million-since-start-of-pandemic/" rel="noopener ugc nofollow" target="_blank">flexible search for guests</a> and UDS, our pioneering next-generation storage system.</p><h1 id="04c3" class="lr ls ji bn lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm mn mo gc">My identity: female, South Asian, immigrant</h1><p id="9223" class="pw-post-body-paragraph kf kg ji kh b ki mp kk kl km mq ko kp kq mr ks kt ku ms kw kx ky mt la lb lc jb gc">People often point out that I’m unique for being a female leader in tech. But in reality, there are three important aspects of my identity: yes, I’m a woman, but I’m also South Asian and an immigrant. All of these have shaped who I am today.</p><p id="9b15" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">I grew up in a very different culture. We were discouraged from challenging the status quo, and for my parents and grandparents, the idea was that if you work extremely hard, recognition will follow. That’s not the way it works here: it sometimes seems like you need to have an opinion and advocate for yourself in order to be taken seriously.</p><p id="5708" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">In many ways, I think being different is an advantage as a leader. While I encourage everyone on the team to make sure their voice is being heard, I also believe in being your natural self. That’s how I’ve been able to build trust with my teams, by letting them see the real me. My philosophy is that no one can be an expert in everything. 
What you’ll see is varying degrees of expertise in people — and I want to fully support that diversity of thought and experience, because a team that’s well-rounded is more effective.</p><h1 id="222d" class="lr ls ji bn lt lu lv lw lx ly lz ma mb mc md me mf mg mh mi mj mk ml mm mn mo gc">Bringing people along</h1><p id="06d2" class="pw-post-body-paragraph kf kg ji kh b ki mp kk kl km mq ko kp kq mr ks kt ku ms kw kx ky mt la lb lc jb gc">When joining Airbnb, I asked to have the dedicated time and agency to do work around diversity and gender parity. I’m now the co-sponsor for the Tech Diversity Council alongside <a class="au lp" rel="noopener" href="https://medium.com/airbnb-engineering/my-journey-to-airbnb-lucius-diphillips-79d1f0bc72a2">Lucius DiPhillips</a> (CIO), where we advocate for diversity-related projects around the Tech org. I’m also one of the advisors for our Asians@ employee resource group.</p><p id="d3e0" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">There’s something special about these employee resource groups here at Airbnb that I haven’t seen before. It’s a very small, close-knit group, and we can relate to our similar upbringing and cultural norms. We genuinely look out for each other and amplify our Asians@ colleagues’ voices.</p><p id="2f74" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc">There’s a saying that “if you want to go fast, go alone, but if you want to go far, go together.” Whether it’s sharing context about our work, being vulnerable about my mistakes, or building a diverse organization, I very much believe in bringing people along. 
I couldn’t be at a better place than here at Airbnb, where our company’s mission is for anyone to belong anywhere.</p><p id="3913" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><em class="lq">Interested in working at Airbnb? We’re hiring! Check out these open roles:</em></p><p id="c2d7" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><a class="au lp" href="https://careers.airbnb.com/positions/3029584/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Distributed Storage</a></p><p id="7ec1" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><a class="au lp" href="https://careers.airbnb.com/positions/3903900/?gh_src=3da3a8881us" rel="noopener ugc nofollow" target="_blank">Senior Frontend Infrastructure Engineer, Web Platform</a></p><p id="16f3" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><a class="au lp" href="https://careers.airbnb.com/positions/2410642/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Cloud Infrastructure</a></p><p id="2793" class="pw-post-body-paragraph kf kg ji kh b ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc jb gc"><a class="au lp" href="https://careers.airbnb.com/positions/3747712/" rel="noopener ugc nofollow" target="_blank">Senior/Staff Backup and Recovery Engineer, Storage Infrastructure</a></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-kamini-dandapani-7f51f1fbb2bb</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-kamini-dandapani-7f51f1fbb2bb</guid>
      <pubDate>Wed, 11 May 2022 21:35:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Continuous Delivery at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@jvanderhaeghe?source=post_page-----6ac042bc7876--------------------------------"><div class="l dq"><img alt="jens vanderhaeghe" class="l ci fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/0*cgitQyoogwQC8AFM.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gc"><div class="hd o he"><div><div class="cj" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@jvanderhaeghe?source=post_page-----6ac042bc7876--------------------------------">jens vanderhaeghe</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bn b bo bp co">Apr 22</p><div class="hw cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">7 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div></div><div class="cl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cj" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cj" role="tooltip" 
aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><figure class="gr gt jy jz ka kb gn go paragraph-image"><div role="button" tabindex="0" class="kc kd dq ke cf kf"><div class="gn go jx"><img alt="" class="cf kg kh" src="https://miro.medium.com/max/1400/1*_bvb5WtcQRE3mL-32b0F4g@2x.png" width="700" height="466" role="presentation" /></div></div></figure><p id="8a04" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc"><a class="au lg" href="https://www.linkedin.com/in/jensvanderhaeghe" rel="noopener ugc nofollow" target="_blank">Jens Vanderhaeghe</a>,<a class="au lg" href="mailto:manish.maheshwari@airbnb.com" rel="noopener ugc nofollow" target="_blank">Manish Maheshwari</a></p><h1 id="2330" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">Introduction</h1><p id="b323" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">Over the years, Airbnb’s tech stack has <a class="au lg" rel="noopener" href="https://medium.com/airbnb-engineering/building-services-at-airbnb-part-1-c4c1d8fa811b">shifted</a> from a monolith to 1,000+ services in our service-oriented architecture (SOA). While this migration solved our problems scaling our application architecture, it also introduced an array of new challenges.</p><p id="282b" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">In this blog post we’ll cover the deployment challenges faced on the road to our current architecture and how we’ve solved those problems by adopting Continuous Delivery best practices on top of <a class="au lg" href="https://spinnaker.io/" rel="noopener ugc nofollow" target="_blank">Spinnaker</a>. 
We’ll do a deep dive into how we carried out such a large-scale migration in a short timespan while maintaining developer productivity along the way.</p><h1 id="b3ed" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">From Deployboard to Spinnaker</h1><p id="97ac" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc"><a class="au lg" rel="noopener" href="https://medium.com/airbnb-engineering/introducing-deploy-pipelines-to-airbnb-fc804ac2a157">Deployboard</a>, Airbnb’s legacy deployment tool, was designed for a monolith with a few centrally managed pipelines. As we started moving to SOA, thousands of code changes across hundreds of service teams were being deployed. Deployboard was not designed for an SOA, which is characterized by decentralized deployments. We needed something much more templated so that teams could quickly get a standard, best-practice pipeline, rather than start from scratch for every new service. Rather than continuing to build in-house solutions with siloed knowledge, it made the most sense for us to adopt open source solutions built from the ground up for decentralized, SOA pipelines.</p><p id="8972" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Spinnaker is proven at Airbnb’s scale, and beyond, by industry peers like Google and Netflix. We believe continuous delivery isn’t a problem unique to Airbnb, and decided we’d benefit from collaborating with the larger community. We chose Spinnaker as the replacement for Deployboard in part because we could bridge functionality gaps by plugging in custom logic easily, without forking the core code. 
Also, it was important to us that Spinnaker supported automated canary analysis (ACA), an extremely effective strategy for reducing the blast radius of buggy deployments.</p><h1 id="ead5" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">Migrating to Spinnaker</h1><p id="b1b5" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">In deciding to switch rather than evolve, we created a new problem: How do we get a globally distributed team of thousands of engineers working on thousands of services (each with its own deploy pipeline), working under business pressure to continuously improve their product and code base, to change one of the most important tools they depend on for day-to-day productivity?</p><p id="f6d5" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">We were particularly worried about the “long-tail migration problem,” where we successfully get 80% of the services migrated in the first year or so, but the remaining ones become stuck indefinitely on the old system. 
Having to operate in such a hybrid mode is costly, and it also is a reliability and even security risk, because the “legacy” systems (including the legacy deploy system) receive less and less attention over time.</p><p id="59fb" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Rather than forcing yet another new tool on our engineers, we came up with a migration strategy based on three pillars: focus on the benefits, automated onboarding, and data.</p><figure class="ml mm mn mo gz kb gn go paragraph-image"><div role="button" tabindex="0" class="kc kd dq ke cf kf"><div class="gn go mk"><img alt="" class="cf kg kh" src="https://miro.medium.com/max/1400/0*2fjFQ8VIEm7LD_Iz" width="700" height="268" role="presentation" /></div></div><figcaption class="mp bm gp gn go mq mr bn b bo bp co">The 3 pillars of our migration strategy</figcaption></figure><h2 id="789a" class="ms li ja bn lj mt mu mv ln mw mx my lr kt mz na lv kx nb nc lz lb nd ne md nf gc">Focus on Benefits</h2><p id="5c2d" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">By focusing on the benefits of Spinnaker, we encouraged engineering teams to adopt Spinnaker voluntarily rather than forcing them.</p><p id="8e7b" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">We started out by manually onboarding a small group of early adopters. We identified a set of services that were prone to causing incidents or had a complicated deployment process. By migrating these services onto Spinnaker and automating their release process using a deployment pipeline with ACA, we were quickly able to demonstrate value. As we onboarded more teams, we iterated on the feature gaps between Deployboard and Spinnaker. 
These early services served as case studies, proving to both the rest of engineering and leadership that adopting an automated and standardized deployment process provides huge benefits.</p><p id="c2f4" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">These early adopters saw benefits so significant that they ended up becoming evangelists for continuous delivery and Spinnaker, spreading the word to other teams organically.</p><h2 id="f966" class="ms li ja bn lj mt mu mv ln mw mx my lr kt mz na lv kx nb nc lz lb nd ne md nf gc">Automated Onboarding</h2><p id="e407" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">As more and more services started adopting Spinnaker, the Continuous Delivery team could no longer keep up with demand. We switched gears and focused on building automated tooling to onboard services to Spinnaker.</p><p id="41b3" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">At Airbnb, we store configuration as code <a class="au lg" href="https://www.infoq.com/presentations/airbnb-kubernetes-services/" rel="noopener ugc nofollow" target="_blank">using a framework called OneTouch</a>. This allows engineers to change both the code and the infrastructure running it in a single commit and in the same folder. 
All infrastructure changes are version controlled.</p><figure class="ml mm mn mo gz kb gn go paragraph-image"><div role="button" tabindex="0" class="kc kd dq ke cf kf"><div class="gn go ng"><img alt="Example of a codified Spinnaker pipeline" class="cf kg kh" src="https://miro.medium.com/max/1400/0*1fytVc5gswBQQYwZ" width="700" height="724" /></div></div><figcaption class="mp bm gp gn go mq mr bn b bo bp co">Example of a codified Spinnaker pipeline</figcaption></figure><p id="8d07" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Following the OneTouch philosophy, we created an abstraction layer on top of Spinnaker that enables all continuous delivery configuration to be source controlled and managed by our existing tools and processes.</p><p id="8548" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Today, when new services are created they get Spinnaker integration, including ACA, for free out of the box.</p><h2 id="f94a" class="ms li ja bn lj mt mu mv ln mw mx my lr kt mz na lv kx nb nc lz lb nd ne md nf gc">Data</h2><p id="c8fb" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">In addition to focusing on the benefits and making it easy to onboard, we wanted to clearly communicate the value-add of adopting Spinnaker in a data-driven way. 
We automatically instrumented <a class="au lg" rel="noopener" href="https://medium.com/airbnb-engineering/supercharging-apache-superset-b1a2393278bd">Superset</a> dashboards for each service that adopted Spinnaker.</p><figure class="ml mm mn mo gz kb gn go paragraph-image"><div role="button" tabindex="0" class="kc kd dq ke cf kf"><div class="gn go nh"><img alt="" class="cf kg kh" src="https://miro.medium.com/max/1400/0*UaIkauaDzJcPZuOZ" width="700" height="291" role="presentation" /></div></div></figure><p id="f3d7" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Example of an instrumented dashboard for a service that has adopted Spinnaker</p><p id="9dad" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Service owners get insight into deployment data like deploy frequency and number of regressions prevented by ACA. Most service owners saw a significant increase in deployment frequency and a marked decrease in production incidents by adopting our new tooling. By arming our users with the right data, they can more easily advocate for the benefits of adopting continuous delivery.</p><h1 id="90fb" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">Clearing the final hurdle</h1><p id="4ffc" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">As expected, we eventually hit an inflection point in adoption. Organic adoption slowed as we reached ~85% of deployments being done on Spinnaker.</p><p id="ea12" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">Once we hit this point, it was time to switch our strategy again, to adopt the lagging services. 
Our plan consisted of the following steps.</p><ol class=""><li id="e93c" class="ni nj ja kk b kl km kp kq kt nk kx nl lb nm lf nn no np nq gc"><strong class="kk jb"><em class="nr">Stop the bleeding</em></strong> <br />The first thing we did was stop any new service from being deployed with Deployboard. This kept the list of services still to be migrated from growing. We did this by giving engineers ample heads-up that this change was coming.</li><li id="7053" class="ni nj ja kk b kl ns kp nt kt nu kx nv lb nw lf nn no np nq gc"><strong class="kk jb"><em class="nr">Announce deprecation date + increase friction </em></strong><br />We gradually increased the friction of using Deployboard instead of Spinnaker by adding a banner and warning inside Deployboard. We also instituted an exemption process that would allow us to catch major blockers well before the actual deprecation date without hurting customer experience.</li><li id="e2cd" class="ni nj ja kk b kl ns kp nt kt nu kx nv lb nw lf nn no np nq gc"><strong class="kk jb"><em class="nr">Send out automated PRs for the remaining services. </em></strong><br />To ensure we could also help onboard services whose owners are resource constrained, we once again leveraged tools like our in-house refactor tool, Refactorator, to do the heavy lifting.</li><li id="dc2a" class="ni nj ja kk b kl ns kp nt kt nu kx nv lb nw lf nn no np nq gc"><strong class="kk jb">Deprecation date and post-deprecation follow-up. </strong><br />On the deprecation date, we had code in place that blocked any OneTouch deploy from Deployboard. We left some loopholes in place in case there were services that still needed to use Deployboard for emergency reasons. The exemption list allows them to temporarily get access to Deployboard. Engineers on the CD team can also still deploy with Deployboard; a simple page to the on-call can quickly help service owners in this case. 
As of today, the number of those cases remains small given the amount of preparation we’ve done.</li></ol><figure class="ml mm mn mo gz kb gn go paragraph-image"><div role="button" tabindex="0" class="kc kd dq ke cf kf"><div class="gn go nh"><img alt="" class="cf kg kh" src="https://miro.medium.com/max/1400/0*vXsxUvTvvx0rP7gw" width="700" height="336" role="presentation" /></div></div><figcaption class="mp bm gp gn go mq mr bn b bo bp co">By adding a banner to Deployboard recommending that engineers adopt Spinnaker, we were able to drive adoption more quickly.</figcaption></figure><figure class="ml mm mn mo gz kb gn go paragraph-image"><div role="button" tabindex="0" class="kc kd dq ke cf kf"><div class="gn go nh"><img alt="" class="cf kg kh" src="https://miro.medium.com/max/1400/0*-QuiF4QGkdQN1MV_" width="700" height="497" role="presentation" /></div></div><figcaption class="mp bm gp gn go mq mr bn b bo bp co">Example of an automated Pull Request that migrates a service from Deployboard to Spinnaker with minimal engineering effort.</figcaption></figure><h1 id="92a6" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">Future Plans and Opportunities</h1><p id="6216" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">Now that we’ve standardized our deployment process, we’re excited to integrate various existing tools at Airbnb into our continuous delivery pipelines. In 2022 and beyond, we are investing resources into integrating automated load testing, providing a way to safely toggle feature flags, and enabling blue/green deployments to facilitate instant rollbacks. 
More broadly, we see Spinnaker not only as a tool for code deployments, but also for the automation of various manual processes, allowing engineers to orchestrate any arbitrary workload as a pipeline.</p><p id="831e" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">During our migration, we’ve made a ton of modifications, both large and small, to Spinnaker, which is a testament to how flexible the tool is. We will be focused on upgrading to the latest open-source version and are looking forward to contributing some of our changes back to the open-source community.</p><h1 id="1fac" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">Conclusion</h1><p id="6c6c" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">In our move from a monolithic architecture to SOA, we needed to rethink the way we do deployments at Airbnb.</p><p id="93ee" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">By creating a Continuous Delivery team focused on delivering great tools to safely and easily deploy code, we were able to migrate from our in-house tool, Deployboard, to Spinnaker. This was a very carefully planned and crafted migration. To adopt the majority of services, we focused on the benefits using a data-driven and automated approach to migration.</p><p id="77f1" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">As expected, there was a long tail of services that didn’t organically adopt our new tools. 
We were able to get to the 100% finish line by shifting our strategy towards adding more friction and eventually deprecating our old tool.</p><p id="7da1" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc">This migration now serves as a blueprint for other infrastructure-related migrations at Airbnb, and we look forward to continuing to iterate on our strategies for bringing better tools to our engineers while maintaining existing productivity and reducing toil.</p><h1 id="913e" class="lh li ja bn lj lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me gc">Acknowledgments</h1><p id="04aa" class="pw-post-body-paragraph ki kj ja kk b kl mf kn ko kp mg kr ks kt mh kv kw kx mi kz la lb mj ld le lf it gc">None of our achievements would have been possible without the support of the entire Continuous Delivery team: <a class="au lg" href="mailto:jerry.chung@airbnb.com" rel="noopener ugc nofollow" target="_blank">Jerry Chung</a>, Freddy Chen, Alper Kokmen, Brian Wolfe, <a class="au lg" href="mailto:dion.hagan@airbnb.com" rel="noopener ugc nofollow" target="_blank">Dion Hagan</a>, Ryan Zelen, Greg Foster, Jens Vanderhaeghe, Mohamed Mohamed, Jake Silver, Manish Maheshwari and Shylaja Ramachandra. The entire Developer Platform organization rallied behind this effort. We’re also grateful to the countless engineers at Airbnb who have adopted Spinnaker over the years and have provided us with valuable feedback. We’d also like to thank all of the people at our peer companies and volunteers who have spent countless hours working on the open source Spinnaker project.</p><p id="8f7d" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc"><strong class="kk jb"><em class="nr">Interested in working at Airbnb? 
Check out these open roles:</em></strong></p><p id="ed40" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc"><a class="au lg" href="https://careers.airbnb.com/positions/3696687/?gh_src=08eeee991us" rel="noopener ugc nofollow" target="_blank">Senior/Staff Software Engineer, Developer Infrastructure</a><br /><a class="au lg" href="https://careers.airbnb.com/positions/3903900/?gh_src=e91bd0291us" rel="noopener ugc nofollow" target="_blank">Senior Frontend Infrastructure Engineer, Web Platform</a></p><p id="7058" class="pw-post-body-paragraph ki kj ja kk b kl km kn ko kp kq kr ks kt ku kv kw kx ky kz la lb lc ld le lf it gc"><em class="nr">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/continuous-delivery-at-airbnb-6ac042bc7876</link>
      <guid>https://medium.com/airbnb-engineering/continuous-delivery-at-airbnb-6ac042bc7876</guid>
      <pubDate>Fri, 22 Apr 2022 19:09:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Florian Andes]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gq gr gs gt gu gv gw gx gy gz l"><div class="o ha u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----5080685262d3-----------------------------------"><div class="l ct"><img alt="AirbnbEng" class="l dv fn hb hc fr" src="https://miro.medium.com/fit/c/96/96/1*PrgppbVAePgtuFs2XZa8Ig.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bo b cr cs gc"><div class="hd o he"><div><div class="cu" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----5080685262d3-----------------------------------">AirbnbEng</a></div></div><div class="hf hg hh hi hj d"></div></div><div class="o ao hv"><p class="pw-published-date bo b bp bq br">Apr 12</p><div class="hw cu" aria-hidden="true">·</div><div class="pw-reading-time bo b bp bq br">5 min read</div></div></div></div><div class="o ao"><div class="h k hx hy hz"><div class="ia l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="ia l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div></div><div class="bl ie"><div></div></div></div><div class="if ig ih j i d"><div class="ii l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="ii l ft"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="l ft"><div><div class="cu" role="tooltip" 
aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="it iu iv iw ix"><div class=""><p id="f3f4" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">From building airplanes to Staff Technical Program Manager at Airbnb</p><figure class="kw kx ky kz gz la gn go paragraph-image"><div role="button" tabindex="0" class="lb lc ct ld ea le"><div class="gn go kv"><img alt="" class="ea lf lg" src="https://miro.medium.com/max/1400/0*gHm8C-tJ3j2bavkT" width="700" height="467" role="presentation" /></div></div></figure><p id="6909" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au lh" href="https://www.linkedin.com/in/floandes/" rel="noopener ugc nofollow" target="_blank"><em class="li">Florian Andes</em></a><em class="li"> is a Staff Technical Program Manager at Airbnb. He has over 10 years of experience that spans the software, manufacturing, and strategy consulting industry. He studied in Frankfurt, London, Singapore, and Boston, where he received a bachelor’s and MBA degree in Business and Entrepreneurship.</em></p><p id="03d1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><em class="li">Though it can be hard and intimidating to find your place in the “big tech” industry in Silicon Valley, Florian has relied on curiosity and openness to establish a successful career at Airbnb. 
Read on for Florian’s own words on working at the intersection of business and software engineering, transferring to the US and scaling tech programs from zero to 10x for Airbnb, and more.</em></p><h1 id="463e" class="lj lk ja bo ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">A global citizen finding a home for his career</h1><p id="3240" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">Many years ago, Airbnb had a tagline that really inspired me: “Don’t Go There. Live There.” I’ve tried to embrace that idea as much as I could. I’ve had the chance to live and work in several different countries — Germany, the UK, Singapore, China, and now the US, where I’m currently based in San Francisco.</p><p id="d028" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">I grew up in Southern Germany, specifically a smaller city called Ulm (which is famous for Albert Einstein, and for having the tallest church building in the world). Growing up, my dad was a great inspiration for me. In his twenties, he started his own business. His journey taught me a lot about perseverance, dedication, and visionary thinking. I’m still inspired by how he pioneered the intersection of hardware and software engineering in the very early days of computers.</p><p id="c3c4" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">In my career, I’ve broadly moved from hardware to software engineering across different industries. (My first job was folding pizza boxes at a friend’s restaurant.) In Germany, there’s a large hardware and manufacturing presence: cars, industrial machines, and appliances in general. 
I started off at Airbus, building airplanes with advanced plastics and fiber materials, and then moved into the automotive industry and strategy consulting in the mobility space.</p><p id="85f1" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">Along the way, I became really interested in software companies and internet technology. In Singapore, where I studied for my MBA, I joined a fast-growing tech startup to manage their partnerships across APAC. I loved it. I was flying around Southeast Asia to speak at startup events, pitch to investors, explain the idea and what problem we were solving — it was all very exciting to me.</p><p id="750f" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">I ended up attending a fireside chat with Mike Curtis, Airbnb’s VP of Engineering at that time, hosted by a local startup co-working space, where he shared lessons he picked up on his journey. I was inspired and ended up connecting with Mike and some folks from Airbnb after the talk, and the opportunity for me to join Airbnb grew organically from there.</p><h1 id="6d68" class="lj lk ja bo ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Pioneering projects at the intersection of tech and business</h1><p id="6939" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">TPMs are involved in every major release at Airbnb. Some TPMs are more product and business-focused (that’s me), and others are more infrastructure and platform-oriented. The potential for impact is high because you’re often very close to engineering leaders. 
You have conversations with the CTO, and many senior leaders in engineering — and on our product teams, as well.</p><p id="fc59" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">I’ve been with Airbnb now for more than five years, and over time, my work has expanded in technical depth and also in breadth. I started out with a focus on business operations and strategy in EMEA (based in Berlin), then transferred to the US to build out an entirely new API program from scratch, and I’m now overseeing all technical programs related to hosting products at Airbnb. It’s a really interesting area that bridges the gap between the tech (engineering, product, data science, and design) and commercial parts of our business.</p><p id="fb2a" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">There are two projects I have been involved in recently that demonstrate the breadth of our programs. One is the adoption of our <a class="au lh" href="https://airbnb.design/building-a-visual-language/" rel="noopener ugc nofollow" target="_blank">Design Language System</a> (DLS), a repository of prebuilt components that designers and engineers can reuse instead of building something new. So whenever an engineer at Airbnb has to implement a button, they don’t have to build this from scratch. Another is reducing our <a class="au lh" rel="noopener" href="https://medium.com/airbnb-engineering/our-journey-towards-cloud-efficiency-9c02ba04ade8">AWS costs</a>. 
We introduced a new attribution model at Airbnb that directly associates AWS costs to different teams, and my role was monitoring our organization’s consumption and championing a lot of cultural change to think about AWS cost-efficiency whenever we build products.</p><h1 id="d7a8" class="lj lk ja bo ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">Strategies for remote program management</h1><p id="6481" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">Recently a big focus has been, naturally, keeping teams engaged and collaborative during remote work. Communicating a clear program vision and using frameworks to keep projects on track are two strategies that are more important than ever.</p><p id="d39c" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">One other thing that I consider critical is celebrating wins and key milestones. Before, when we were in the office, it was much easier to celebrate big wins. At Airbnb, we have a tool where you can send appreciation to others. I use it all the time, because when I work with multiple stakeholders across design, product, engineering, and QA, people are generally contributing at different stages. When the project ends, you might not have this big forum anymore, because most people have moved on already. But it’s important to still give credit to the work that everyone contributed along the way and make sure their managers have visibility into their contributions.</p><h1 id="294a" class="lj lk ja bo ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg gc">What makes the TPM role unique at Airbnb</h1><p id="adce" class="pw-post-body-paragraph jx jy ja jz b ka mh kc kd ke mi kg kh ki mj kk kl km mk ko kp kq ml ks kt ku it gc">TPM is relatively new as a function itself, and every company approaches it slightly differently. 
Even at Airbnb, it’s continuously evolving and progressing. One thing that’s always the case, though, is that TPMs need to use influence to lead programs, teams, and products — without exercising direct authority. As a TPM, you have to champion ideas, be a great communicator, and work with your partners to align priorities and push projects forward.</p><p id="3d96" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">You don’t have to study software engineering to be a technical program manager — I think you need to show that you have the ability to grow and learn on the technical side, and this can happen before the job or on the job. For instance, at first I didn’t have much context about AWS optimization, frontend design language systems, or open source, but people at Airbnb are super open to sharing knowledge which allowed me to quickly onboard and take on these programs. You just need to be curious and open to learn.</p><p id="d472" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc">I love working at Airbnb because of the people and the mission-driven nature of the company. And I think what makes being a TPM at Airbnb extremely unique is that there’s still a lot of “greenfield” areas and unexplored territory, where you can really help define a new strategy and vision.</p><p id="f123" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><em class="li">Interested in joining Airbnb as a TPM? 
Check out these open roles:</em></p><p id="538d" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au lh" href="https://careers.airbnb.com/positions/4024213/?gh_src=6b8b81e61us" rel="noopener ugc nofollow" target="_blank">Senior TPM, Guest and Host Technology</a></p><p id="264f" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au lh" href="https://careers.airbnb.com/positions/3955056/?gh_src=8a794a6e1us" rel="noopener ugc nofollow" target="_blank">Staff TPM, Insurance Platform</a></p><p id="0713" class="pw-post-body-paragraph jx jy ja jz b ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku it gc"><a class="au lh" href="https://careers.airbnb.com/positions/3651016/?gh_src=77ab28041us" rel="noopener ugc nofollow" target="_blank">Staff TPM, Infrastructure Regionalization</a></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-florian-andes-5080685262d3</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-florian-andes-5080685262d3</guid>
      <pubDate>Tue, 12 Apr 2022 19:56:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Hacking Human Connection: the Story of Awedience]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@avand?source=post_page-----ebf66ee6af0e-----------------------------------"><div class="l ct"><img alt="Avand Amiri" class="l dv fn ha hb" src="https://miro.medium.com/fit/c/96/96/1*uRf1xD16DZMiCJST8Svcyw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bo b cr cs gb"><div class="hc o hd"><div><div class="cu" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@avand?source=post_page-----ebf66ee6af0e-----------------------------------">Avand Amiri</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bo b bp bq br">Apr 5</p><div class="hv cu" aria-hidden="true">·</div><div class="pw-reading-time bo b bp bq br">11 min read</div></div></div></div><div class="o ao"><div class="h k hw hx hy"><div class="hz l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="hz l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div></div><div class="bl id"><div></div></div></div><div class="ie if ig j i d"><div class="ih l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="ih l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="ih l fs"><div><div class="cu" role="tooltip" aria-hidden="false"></div></div><div class="l fs"><div><div class="cu" role="tooltip" 
aria-hidden="false"></div></div></div></div></div></div></div></div></div></div></div></div></div></header><section><div><div class="is it iu iv iw"><div class=""><div class=""><h2 id="db47" class="pw-subtitle-paragraph jw iy iz bo b jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn br">How a home-grown product helps Airbnb employees feel more connected during solitary times</h2></div><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn ko"><img alt="A screenshot of Awedience during an all-company meeting. Brian Chesky, CEO, is in the center and is surrounded by thousands of employees represented by tiny squares. Each square is either a picture of the employee, a color, or a letter that people have arranged to spell words or draw shapes. Emojis are captured emerging from some of the seats." class="ea ky kz" src="https://miro.medium.com/max/1400/1*u9pnag1uWfgwnT8tv42I1g.jpeg" width="700" height="408" /></div></div></figure><h1 id="440a" class="la lb iz bo lc ld le lf lg lh li lj lk kf ll kg lm ki ln kj lo kl lp km lq lr gb">Introduction</h1><p id="e9c3" class="pw-post-body-paragraph ls lt iz lu b lv lw ka lx ly lz kd ma mb mc md me mf mg mh mi mj mk ml mm mn is gb">This is the story of how Airbnb employees stayed connected during a time they had never felt more apart. In this post, you’ll learn how an idea turned into an internal product that is now a core part of how Airbnb operates.</p><p id="8b34" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">When you walk through the doors of an Airbnb office, you feel an energy that’s both inspiring and intimidating. 
After more than five years with the company, I explain this duality as Airbnb being both incredibly entrepreneurial and aspirational.</p><p id="e2da" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Airbnb company meetings are no different. Brian Chesky and his team keep our all-hands meetings exciting. I know what you’re thinking: “exciting meetings?!” But in all seriousness our all-hands are not just informative, they’re spectacular. Whether it’s drinking a smoothie of dehydrated bugs in solidarity with Engineering or eating spicy chicken wings in a Hot Ones-style Q&amp;A, our meetings are informative, energizing, and engaging. In somber moments, they’re human and heartfelt.</p><p id="fce1" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">The pandemic changed that. Feeling connected to the presenter, feeling connected to our peers, or feeling the presenter’s connection to the audience all vanished. Instead, we each separately watched the presenter streaming through a 16:9 rectangle on our laptops. Other people were watching concurrently (one could assume based on the invite) but it couldn’t be <em class="mt">felt</em>. 
Inspired, I set out to solve this, wondering, “<a class="au mu" href="https://en.wikipedia.org/wiki/The_Six_Million_Dollar_Man" rel="noopener ugc nofollow" target="_blank">we have the technology</a>, why can’t we see and interact with everyone watching live?”</p><h1 id="df83" class="la lb iz bo lc ld le lf lg lh li lj lk kf ll kg lm ki ln kj lo kl lp km lq lr gb">Inspiration</h1><p id="b305" class="pw-post-body-paragraph ls lt iz lu b lv lw ka lx ly lz kd ma mb mc md me mf mg mh mi mj mk ml mm mn is gb">It’s impossible to know under which circumstances inspiration will find us, yet in hindsight it seems perfectly planned.</p><p id="e955" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Airbnb is a community based on connection and belonging, which is the reason so many of us came to work here. However, in March of 2020, as the spread of COVID-19 forced us to shelter in place, we struggled to find ways to preserve those traits within our company culture. What it meant to feel connected to the world and to each other was being redefined by emojis over video. Video chat helps us stay in communication, but it’s <a class="au mu" href="https://www.youtube.com/watch?v=DYu_bGbZiiQ" rel="noopener ugc nofollow" target="_blank">comically clunky</a>, dry, and lifeless. If these technologies feel contrived, it’s because they are. We all know what an authentic human connection actually feels like, and it’s not that.</p><p id="997b" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">For almost two years before the pandemic hit, I volunteered to help produce Airbnb’s quarterly engineering all-hands. 
Unbeknownst to me, I was actually studying the question, “how does one make people feel more connected?” Whether it meant hosting an orca-themed relay race across our Portland, Seattle, and San Francisco offices, or projecting live chat in auditoriums to make the audience more engaged, I had been finding ways to foster human connection.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn mv"><img alt="Engineers dressed up in costumes working in pairs at laptops. Ping pong balls are being thrown at them from the crowd, while they try to focus on solving engineering problems on the computers." class="ea ky kz" src="https://miro.medium.com/max/1400/1*SIXAHtcfgeFICxZ6B11UwA.gif" width="700" height="394" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">Pairs of engineers complete programming challenges while being pelted with ping pong balls by the audience. This was just one of the many ridiculous segments of Nerds@, the engineering all-hands meeting I voluntarily produced for two years.</figcaption></figure><p id="6240" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Isolated at home with all these insights into event production, it’s not surprising that, while I watched our CEO, Brian Chesky, address thousands of employees via live stream, my attention drifted to the little number in the top right corner that showed how many other people were watching. These were my colleagues, my friends, some of the most talented people in the world. In our isolation, the only affordance that existed to capture our experience of watching this live stream together was an aggregate count. Instagram and YouTube allowed people to express themselves with emojis and comments during live broadcasts. 
Internally, we could not.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn mz"><img alt="Photos of various employees’ at-home workstations. Some photos show laptops on a desk or coffee table. Others show the stream projected onto a TV in a living room. Two of the photos feature pets in the background." class="ea ky kz" src="https://miro.medium.com/max/1400/1*t-XAOqp7jvfTDsFhQpSejQ.png" width="700" height="414" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">A look at life before Awedience: everyone watching the same stream at home alone, with the exception of some furry pals.</figcaption></figure><p id="416d" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">An idea started to emerge: supplement the live stream with small, thumbnail-sized videos of all the viewers’ webcams and capture audience sentiment with emojis. It seemed simple enough.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn na"><img alt="" class="ea ky kz" src="https://miro.medium.com/max/1400/1*rat4XcYbupe-pho3KM1wqw.jpeg" width="700" height="263" role="presentation" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">The original sketch. Simple, right?</figcaption></figure><p id="b857" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Yet even this concept proved untenable. I had never built a fully real-time web application and would likely need months to figure out the webcam piece. I didn’t have months. 
I was only willing to dedicate a few days to see if this idea had merit.</p><p id="546c" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><a class="au mu" href="https://entrepreneurshandbook.co/how-airbnb-founders-sold-cereal-to-keep-their-dream-alive-d44223a9bdab" rel="noopener" target="_blank">Cereal Entrepreneurs</a> will attest that ideas are cheap: nobody copies ideas. Only <em class="mt">proven</em> ideas get copied, and this idea was definitely not proven. So I started to collaborate with some of my peers, and <a class="au mu" href="https://www.linkedin.com/in/stepan-parunashvili-65698932" rel="noopener ugc nofollow" target="_blank">Stepan Parunashvili</a> helped us get the ball rolling. “Punt on video for now,” he said. “Start with profile pictures from our company directory. Firebase can handle all the real-time stuff, and we already have an internal authentication service. Boom!”</p><p id="4c96" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Stepan continued to offer his support with the initial infrastructure. We needed a name and thought about it for all of a minute. This product was all about the “audience,” and “aww” is one sound an audience makes when they’re experiencing something together. Inspiring “awe” was also part of the motivation for this work, so we decided on “Awedience.”</p><p id="f723" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Within a few hours, Stepan had the scaffolding complete. People could sign in and <a class="au mu" href="https://en.wikipedia.org/wiki/%22Hello,_World!%22_program" rel="noopener ugc nofollow" target="_blank">“hello world”</a> with other users. Our authentication service played gatekeeper, guaranteeing that the application would have access to an employee’s LDAP username after they signed in. 
That username enabled me to load a profile picture from our company directory. <a class="au mu" href="https://firebase.google.com/docs/database/" rel="noopener ugc nofollow" target="_blank">Firebase</a> served as the realtime database, and React sat right on top of it, bound (almost directly) to Firebase events. I could finally focus on my specialty, UI and UX. With an iframe embedded front and center, the UI naturally formed a U-shaped auditorium with virtual seats. When you clicked a seat, your picture would appear, and as you reacted, emojis would float out of your seat for everyone to see. You could also write short messages and they would pop out of your seat too, simulating shouts to the crowd.</p><p id="c19d" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">We had built something. Now, would anyone care?</p><h1 id="75bc" class="la lb iz bo lc ld le lf lg lh li lj lk kf ll kg lm ki ln kj lo kl lp km lq lr gb">Growth</h1><p id="fbe8" class="pw-post-body-paragraph ls lt iz lu b lv lw ka lx ly lz kd ma mb mc md me mf mg mh mi mj mk ml mm mn is gb">Attention is an incredibly valuable resource and there are lots of ways to get it. Attention is commonly bought. Attention can be diverted from other channels. It can even be stolen.</p><p id="efae" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">But attention can also be <em class="mt">earned</em>. When people <em class="mt">love</em> a product, it not only has true staying power but will grow organically. Therefore, one way to see if people love something new is to say nothing and observe.</p><p id="1da6" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">I invited close colleagues to try Awedience for an all-hands to see what would happen. Intuitively, they took a seat, started reacting, and used it throughout the meeting. 
The feedback was <a class="au mu" href="https://sive.rs/hellyeah" rel="noopener ugc nofollow" target="_blank">overwhelmingly positive</a>. Awedience didn’t do much, but what it did do, it did well enough.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn nb"><img alt="A screenshot of Airbnb’s Slack while Awedience was being used for the first time. Some messages say “we’re sitting together,” “love this feature it’s so cute,” “this is the coolest,” and “I noticed you sat next to me.”" class="ea ky kz" src="https://miro.medium.com/max/1400/1*DPnN31xW0LgGXbbysUO_fw.png" width="700" height="406" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">A glimpse into our company Slack: first reactions to Awedience at Airbnb.</figcaption></figure><p id="2bf6" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">At the time, Brian hosted all-company Q&amp;As weekly. I created a calendar event alongside the all-company one with an alternative URL and only invited people who had previously used Awedience. Within a few months, Awedience was so popular that it was offered as a secondary option in the official calendar invites.</p><p id="94f9" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">With the increased popularity, it was hard to support Awedience on nights and weekends. I asked my team for time to redirect my focus to Awedience for a few months, and they were supportive. The only request was that I figure out the product’s future by the end of that time and not leave things open-ended. Would it become a part of <a class="au mu" href="https://www.airbnb.com/s/experiences/online" rel="noopener ugc nofollow" target="_blank">Online Experiences</a>? Would it become a part of another team’s roadmap? 
We even speculated that it could be a completely new line of business.</p><p id="4eca" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">While it was tempting to keep adding functionality, resourcefulness was now the name of the game. Awedience was crashing during peak moments so performance was and still is the most important feature. Before we implemented throttling, we were binding reactions directly to our app state, which triggered a deluge of re-renders:</p><figure class="kp kq kr ks gy kt"><div class="m l ct"></div></figure><p id="61a2" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">A crude <em class="mt">batchedThrottle</em> function reduced renders when users mashed an emoji button:</p><figure class="kp kq kr ks gy kt"><div class="m l ct"></div></figure><p id="35b0" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Later, additional performance gains were found by detaching the React UI from Firebase real-time callbacks. Eventually, reactions would be managed natively without React at all:</p><figure class="kp kq kr ks gy kt"><div class="m l ct"></div></figure><p id="e243" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">There were more affordances I wanted to explore. At sporting events, attendees often hold up signs, flags, or even paint things on their bodies to spell out a message. Replicating this in Awedience was a huge hit. Rather than a profile picture, attendees could now pick a color, letter, or a portion of a graphic to display from their seat. People show up early and coordinate amongst themselves to spell out messages to represent their team, city, or the company. The result is magical. Awedience didn’t make it easy to tell your neighbor to change their seat picture. 
People were going out of their way to coordinate with one another. Connection between colleagues was happening organically and it was thrilling to see.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn ne"><img alt="A screenshot of Awedience during an all-company meeting. Brian Chesky, CEO, is in the center and is surrounded by thousands of employees represented by tiny squares. The squares spell out words like “insurance,” “human,” or “trust.” There are also a few Airbnb logos being drawn out in 5x5 tiles." class="ea ky kz" src="https://miro.medium.com/max/1400/1*DsAVOD_cZgUwAP39X6qQZQ.jpeg" width="700" height="732" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">An earlier version of Awedience where people were spelling words, representing their teams, cities, and the company. Or the CEO’s face.</figcaption></figure><p id="8b41" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Making sure there were enough seats for everyone was also a challenge. Too few seats and some people can’t participate; too many and the auditorium feels empty. To handle this, Awedience does something its skeuomorphic counterpart can’t do: adding seats as needed. This feature felt vital for a product built at a company so focused on belonging. Later, we would improve on this feature by increasing seating density such that almost 1,000 people are visible “above the fold.”</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn nf"><img alt="" class="ea ky kz" src="https://miro.medium.com/max/1400/1*kwLwo5MNpcT8HLk8xq3L7g.gif" width="700" height="556" role="presentation" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">Adding rows of seats to the bottom worked for a long time but limited users from seeing everyone at once. 
It took the addition of virtual aisles to afford seats being added horizontally without compromising user-generated seat art.</figcaption></figure><p id="0a12" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Self-service features were also prioritized. Brian’s staff immediately wanted to know what content was spurring engagement — a question they hadn’t been able to answer since going remote. Cumulative data from each event was piped into a graphing library for quick and dirty analytics. Similarly, our video production team wanted to be able to create and edit auditoriums without relying on me, so self-service tooling followed as well.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn ng"><img alt="A stacked line graph of emojis over time. One line, applause, for example, goes up to over 300 reactions for a moment. Later, hearts spike to nearly 150. Each emoji has its own usage graphed on the chart." class="ea ky kz" src="https://miro.medium.com/max/1400/1*ceUehJwN6__G7Q_OIQoZrA.png" width="700" height="477" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">Awedience helps presenters understand exactly which parts of their presentation landed for their audience.</figcaption></figure><p id="19c3" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Later, during a hackathon, we even created applause sound effects that naturally scale up from the sound of a few hands clapping to an uproar based on audience engagement.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><a href="https://video.airbnb.com/media/t/1_o3egj5jk"><div class="gm gn nh"><img alt="A video demo of Awedio, in which you can see Brian Chesky being interviewed on CBS while members of the audience applaud around him. 
In addition to seeing the applause emojis rise from the audience, you can also hear the applause generated by the software." class="ea ky kz" src="https://miro.medium.com/max/1400/1*N_NlPojNdwUIK83bev6Rug.jpeg" width="700" height="453" /></div></a><figcaption class="mw ec go gm gn mx my bo b bp bq br">Awedio allows you to hear the audience’s applause reactions.</figcaption></figure><h1 id="2237" class="la lb iz bo lc ld le lf lg lh li lj lk kf ll kg lm ki ln kj lo kl lp km lq lr gb">Moments</h1><p id="6ca3" class="pw-post-body-paragraph ls lt iz lu b lv lw ka lx ly lz kd ma mb mc md me mf mg mh mi mj mk ml mm mn is gb">Awedience made Airbnb feel like Airbnb again. There’s now a place where you can see everyone and feel connected to them. It’s become home to celebratory moments and a place where we can sit alongside one another during somber ones.</p><p id="3d37" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">When Airbnb announced a cut back to our workforce, there was an all-hands scheduled to honor and appreciate the employees who were leaving. However, with VPN access cut to roughly 2,000 soon-to-be alumni, Awedience was suddenly only accessible to the spared employees. Resident security guru, <a class="au mu" href="https://www.linkedin.com/in/keeleysam" rel="noopener ugc nofollow" target="_blank">Sam Keeley</a>, and I committed to making Awedience accessible outside of VPN and almost overnight switched authentication to Google IAP. When the founders addressed the company, they invited a standing ovation for our departing peers and Awedience obliged. 
It’s hard to imagine what kind of impersonal and solitary send-off we would have had without Awedience.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn ni"><img alt="" class="ea ky kz" src="https://miro.medium.com/max/1400/1*znVI-1UvXbpPEn9WClZzJw.gif" width="700" height="419" role="presentation" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">Hundreds of employees joined our founders in a virtual standing ovation for the members of our team that were let go as a result of the pandemic cut backs.</figcaption></figure><p id="341a" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">In May, in the wake of social action around George Floyd’s murder, the company met to address the Black Lives Matter movement. At the end of this meeting, Brian invited the company to take an 8 minute and 46 second moment of silence.</p><figure class="kp kq kr ks gy kt gm gn paragraph-image"><div role="button" tabindex="0" class="ku kv ct kw ea kx"><div class="gm gn nj"><img alt="" class="ea ky kz" src="https://miro.medium.com/max/1400/1*-7x_Yi1B3DiKxQTq6yv_aA.gif" width="700" height="465" role="presentation" /></div></div><figcaption class="mw ec go gm gn mx my bo b bp bq br">Employees join in a somber yet moving virtual moment of silence for George Floyd.</figcaption></figure><h1 id="f67f" class="la lb iz bo lc ld le lf lg lh li lj lk kf ll kg lm ki ln kj lo kl lp km lq lr gb">Conclusion</h1><p id="1b0a" class="pw-post-body-paragraph ls lt iz lu b lv lw ka lx ly lz kd ma mb mc md me mf mg mh mi mj mk ml mm mn is gb">At Airbnb, Awedience is here to stay and now receives ongoing support and maintenance. In collaboration with our Employee Experience team, we found a home where it would make sense long term. 
In fact, if this is the kind of work you find interesting, you may even consider joining our team to help us build internal tools to foster connection — <a class="au mu" href="https://careers.airbnb.com/positions/" rel="noopener ugc nofollow" target="_blank">we’re hiring</a>!</p><p id="f59c" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">I feel fortunate to work for a company that creates space to bring these types of projects to life. Airbnb is an inspiring place — the combination and culmination of a rigorous entrepreneurial spirit and an ongoing commitment to outdo the status quo. That’s the type of environment you need for ideas like Awedience to flourish.</p><p id="b370" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">Awedience is more than just a triumph of passion and creativity. The spark of the idea was just that: a spark. In the words of <a class="au mu" href="https://twitter.com/richardbranson/status/264067714266587136" rel="noopener ugc nofollow" target="_blank">Richard Branson</a>, “opportunities are like the buses — there’s always one coming.” Without the help and support of many amazingly talented colleagues, there would literally be nothing to write about.</p><p id="2d22" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">What makes Awedience awesome is the people. Big ideas are rarely the consequence of one person’s ideas or effort. It takes a lot of people to do incredible things.</p><h1 id="ac3d" class="la lb iz bo lc ld le lf lg lh li lj lk kf ll kg lm ki ln kj lo kl lp km lq lr gb">Acknowledgements</h1><p id="d988" class="pw-post-body-paragraph ls lt iz lu b lv lw ka lx ly lz kd ma mb mc md me mf mg mh mi mj mk ml mm mn is gb">To Stepan Parunashvili for fueling the fire and bootstrapping the infrastructure that got Awedience going. 
Without you, it would not have been possible. Thank you.</p><p id="cb18" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Sam Keeley for enabling and evolving Awedience access for the entire company.</p><p id="5de6" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Joe Gebbia for creating some air space for Awedience to grow and evolve.</p><p id="0b98" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Byoung Bae, Allison Frelinger, Darrick Brown, and Judd Antin of my former team for taking a gamble on Awedience with me.</p><p id="0ad4" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Liz Kleinman and Beth Axelrod for creating a role for me to continue this work.</p><p id="c377" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Shawdi Ilbagian Hahn, Dave O’Neill, Kylie McQuain, Kelly Bechtel, Kate Walsh, Benny Etienne, Carrie Kissell, Alyce Thompson, John Lawrence, and Samantha Eaton for your collaboration and partnership in keeping the company engaged and connected.</p><p id="0c01" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Cory Boldt, Steven McNellie, Garrett McGrath, Alex Lacayo, John Espey, and Scott Ethersmith for your help and creativity on the technical productions.</p><p id="75fb" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Jenna Cushner, Ortal Yahdav, Lucille Hua, Christian Williams, Shawn Terasaki, Brian Wallerstein, Ben Muschol, Mike Fowler, Jason Goodman, Caty Kobe, Joe Lencioni, Nicolas Haunold, Christian Baker, Alan Sun, and Jacqui Watts for your early contributions and 
feedback.</p><p id="c026" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Kevin Swint, Danielle Zloto, Christine Berry, Federica Petruccio, and Consuelo Hernandez for going above and beyond to try Awedience with Online Experiences and the powerful insights that were created as a result.</p><p id="9fe4" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Nicholas Roth, Izzy Rattner, Jonathan Lieberman, Stephen Gikow, Steve Flanders, Lonya Breitel, Alan Shum, Brian Savage, Veronica Mariano, Allie Hastings, Alica Del Valle, Rajiv Patel, and Emily Bullis for your legal support in protecting Awedience’s intellectual property and making external partnerships possible.</p><p id="a77a" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Sarah Baker for always rallying people together to create seat artwork.</p><p id="04e2" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Gaurav Mathur, Hope Eckert, Sean Abraham, Jessie Li, Vaithiyanathan Sundaram, Andy Yasutake, Virginia Vickery, Jonathan Rahmani, Andrew Pariser, Sunakshi Kapoor, Diane Ko, Biki Berry, Francisco Diaz, Erik Ritter, Tony Gamboa, Mohsen Azimi, Bruce Paul, Omari Dixon, Sonia Anderson, CJ Cipriano, Chihwei Yeh, Arie Van Antwerp, Victor De Souza, Sam Shadwell, Deanna Bjorkquist, Jenna Cushner, Richard Kirk, Jake Silver, Alex Rosenblatt, David He, LA Logan, Ryan Booth, Pistachio Matt, Melanie Cebula, Brian Morearty, and Victor Caruso for your participation and support!</p><p id="f9b8" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Stephanie Wei, Micah Roumasset, Ryland Harris, Waylon Janowiak, and Ben Arnon for your willingness to try Awedience outside of Airbnb.</p><p id="a853" 
class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Jerry Chabolla, Nicholas Schell, Ryan Jespersen, Sergio Garcia Murillo, Wes Dagget, and the entire team at Millicast for enabling real-time streaming.</p><p id="675c" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">To Brett Bukowski, Cara Moyer, Nicki Williams, Dylan Hurd, and Lauren Mackevich for encouragement and support in writing this blog post.</p><p id="7203" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb">And lastly to Danee Chavez for powering the light bulb.</p><p id="787d" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><strong class="lu ja">Interested in working at Airbnb? Check out these open roles:</strong></p><p id="397c" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><a class="au mu" href="https://careers.airbnb.com/positions/3714489/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Airfam Products</a></p><p id="77c4" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><a class="au mu" href="https://careers.airbnb.com/positions/3955056/" rel="noopener ugc nofollow" target="_blank">Staff Technical Program Manager, Insurance Platform</a></p><p id="07af" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><a class="au mu" href="https://careers.airbnb.com/positions/3988445/" rel="noopener ugc nofollow" target="_blank">Staff Automation Engineer, BizTech Global Ops</a></p><p id="6deb" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><a class="au mu" 
href="https://careers.airbnb.com/positions/4003012/" rel="noopener ugc nofollow" target="_blank">Operations Engineer</a></p><p id="ee4f" class="pw-post-body-paragraph ls lt iz lu b lv mo ka lx ly mp kd ma mb mq md me mf mr mh mi mj ms ml mm mn is gb"><em class="mt">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/hacking-human-connection-the-story-of-awedience-ebf66ee6af0e</link>
      <guid>https://medium.com/airbnb-engineering/hacking-human-connection-the-story-of-awedience-ebf66ee6af0e</guid>
      <pubDate>Tue, 05 Apr 2022 19:29:00 +0200</pubDate>
    </item>
    <item>
      <title><![CDATA[Hacking Human Connection: the Story of Awedience]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@avand?source=post_page-----da90af2b1d0-----------------------------------"><div class="l dq"><img alt="Avand Amiri" class="l ci fn ha hb" src="https://miro.medium.com/fit/c/96/96/1*uRf1xD16DZMiCJST8Svcyw.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gb"><div class="hc o hd"><div><div class="cj" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@avand?source=post_page-----da90af2b1d0-----------------------------------">Avand Amiri</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Mar 22</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">11 min read</div></div></div></div><div class="o ao"><div class="cl hw hx hy hz d"><div></div></div></div></div></div></header><section><div><div class="ik il im in io"><div class=""><p id="f250" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">by <a class="au km" href="https://www.linkedin.com/in/avand/" rel="noopener ugc nofollow" target="_blank">Avand Amiri</a></p><p id="dc68" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">How a home-grown product helps Airbnb employees feel more connected during solitary times</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn kn"><img alt="A screenshot of Awedience during an all-company meeting. 
Brian Chesky, CEO, is in the center and is surrounded by thousands of employees represented by tiny squares. Each square is either a picture of the employee, a color, or a letter that people have arranged to spell words or draw shapes. Emojis are captured emerging from some of the seats." class="cf kx ky" src="https://miro.medium.com/max/1400/1*wYiW3cB5YfO_oLC9CVHW3A.jpeg" width="700" height="705" /></div></div></figure><h1 id="27ee" class="kz la ir bn lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv lw gb">Introduction</h1><p id="9988" class="pw-post-body-paragraph jo jp ir jq b jr lx jt ju jv ly jx jy jz lz kb kc kd ma kf kg kh mb kj kk kl ik gb">This is the story of how I made the employees at Airbnb feel connected during a time they had never felt more apart. It’s the story of an idea that turned into a core part of how Airbnb operates. In this post, you’ll learn all the critical ingredients that helped me take an idea to wide-spread adoption, and hopefully you’ll be inspired to do the same!</p><p id="3780" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">When you walk through the doors of an Airbnb office, you feel an energy that’s both inspiring and intimidating. After more than five years with the company, I explain this duality as Airbnb being both incredibly entrepreneurial and aspirational.</p><p id="3311" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Airbnb company meetings are no different. Brian Chesky and his team keep our all-hands meetings exciting. I know what you’re thinking: “exciting meetings?!” But in all seriousness our all-hands are not just informative, they’re spectacular. Whether it’s drinking a smoothie of dehydrated bugs in solidarity with Engineering or eating spicy chicken wings in a Hot Ones-style Q&amp;A, our meetings are informative, energizing, and engaging. 
In somber moments, they’re human and heartfelt.</p><p id="c590" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Obviously, the pandemic changed that. Feeling connected to the presenter, feeling connected to our peers, or feeling the presenter’s connection to the audience all vanished. Instead, we each separately watched the presenter streaming through a 16:9 rectangle. Other people were watching concurrently (one could assume based on the invite) but it couldn’t be <em class="mc">felt</em>. Inspired, I set out to solve this.</p><p id="5402" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">“<a class="au km" href="https://en.wikipedia.org/wiki/The_Six_Million_Dollar_Man" rel="noopener ugc nofollow" target="_blank">We have the technology</a>,” I thought to myself, “why couldn’t we see and interact with everyone watching live?”</p><h1 id="9449" class="kz la ir bn lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv lw gb"><strong class="ba">Inspiration</strong></h1><p id="03fe" class="pw-post-body-paragraph jo jp ir jq b jr lx jt ju jv ly jx jy jz lz kb kc kd ma kf kg kh mb kj kk kl ik gb">It’s impossible to know under which circumstances inspiration will find us, and in hindsight it seems perfectly planned.</p><p id="25b3" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">In March of 2020, as the spread of COVID-19 forced us to shelter in place, what it meant to feel connected to the world was redefined by emojis over video. Video chat helps us stay in communication, but it’s <a class="au km" href="https://www.youtube.com/watch?v=DYu_bGbZiiQ" rel="noopener ugc nofollow" target="_blank">comically clunky</a>, dry, and lifeless. If these technologies feel contrived it’s because they are. 
We all know what it actually feels like to feel connected, and it’s not that.</p><p id="e040" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">For almost two years before the pandemic hit, I had been voluntarily producing Airbnb’s quarterly engineering all-hands. Unbeknownst to me, I was actually studying the question, “how does one make people feel more connected?” Whether it meant hosting an orca-themed relay race across our Portland, Seattle, and San Francisco offices, or projecting live chat in auditoriums to make the audience more engaged, I had been studying how to foster human connection.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn md"><img alt="Engineers dressed up in costumes working in pairs at laptops. Ping pong balls are being thrown at them from the crowd, while they try to focus on solving engineering problems on the computers." class="cf kx ky" src="https://miro.medium.com/max/1400/1*SIXAHtcfgeFICxZ6B11UwA.gif" width="700" height="394" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">Pairs of engineers complete programming challenges, while being pelted with ping pong balls by the audience. This was just one of the many ridiculous segments of Nerds@, the engineering all-hands meeting I voluntarily produced for two years.</em></figcaption></figure><p id="1d65" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Under these conditions, it’s not surprising that, while I watched our CEO, Brian Chesky, address thousands of employees via live stream, my attention drifted to the little number in the top right corner that showed how many other people were watching. These were my colleagues, my friends — some of the most talented people in the world. 
In our isolation, the only affordance that existed to capture our experience of watching this live stream together was an aggregate count. Instagram and YouTube allowed people to express themselves with emojis and comments during live broadcasts. Internally, we could not.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mi"><img alt="Photos of various employees at-home workstations. Some photos are laptops on a desk or coffee table. Others are projected onto a TV in a living room. Two of the photos feature pets in the background." class="cf kx ky" src="https://miro.medium.com/max/1400/1*t-XAOqp7jvfTDsFhQpSejQ.png" width="700" height="414" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">A look at life before Awedience: everyone watching the same stream at home alone, with the exception of some furry pals.</em></figcaption></figure><p id="a84b" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">I drew a sketch: supplement the live stream with small, thumbnail-sized videos of all the viewers’ webcams and capture the audience sentiment with emojis. It seemed simple enough.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mj"><img alt="A sketch of the original idea. There’s a video of the speaker on the left. In the middle, there’s a bunch of tiles representing people watching. A heart emoji and “haha” appear as captions over the tiles. On the left, there’s a chat window and some buttons to react to the video with emojis." class="cf kx ky" src="https://miro.medium.com/max/1400/1*rat4XcYbupe-pho3KM1wqw.jpeg" width="700" height="263" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">The original sketch. 
Simple, right?</em></figcaption></figure><p id="5202" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Yet even this concept proved untenable. I had never built a fully real-time web application and would likely need months to figure out the webcam piece. I didn’t have months. I was only willing to dedicate a few days to see if this idea had merit.</p><p id="05b3" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Fortunately, I had learned from many years of my own entrepreneurship that ideas are cheap. Nobody copies ideas. They copy <em class="mc">proven</em> ideas, and this idea was definitely not proven. So I openly started talking to peers at Airbnb about the idea, and one guy, <a class="au km" href="https://www.linkedin.com/in/stepan-parunashvili-65698932" rel="noopener ugc nofollow" target="_blank">Stepan Parunashvili</a>, got the ball rolling. “Punt on video for now,” he said, “start with profile pictures from our company directory. Firebase can handle all the real-time stuff, and we already have an internal authentication service. Boom!”</p><p id="3e45" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">I was still daunted by all the infrastructure, so Stepan helped me get started. He needed a name, so I thought about it for all of a minute. This product was all about the “audience,” and “aww” is one sound an audience makes when they’re experiencing something together. Inspiring “awe” was also kind of the point of this work, so “Awedience” it became.</p><p id="c83d" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">In a few hours, Stepan had the scaffolding. 
People could sign in and <a class="au km" href="https://en.wikipedia.org/wiki/%22Hello,_World!%22_program" rel="noopener ugc nofollow" target="_blank">“hello world”</a> with other users. Our authentication service played gatekeeper, guaranteeing that the application would have access to an employee’s LDAP username after they signed in. That username enabled me to load a profile picture from our company directory. <a class="au km" href="https://firebase.google.com/docs/database/" rel="noopener ugc nofollow" target="_blank">Firebase</a> served as the realtime database, and React sat right on top of it, bound (almost directly) to Firebase events. I could finally focus on my specialty, UI and UX. With an iframe embedded front and center, the UI naturally formed a U-shaped auditorium with virtual seats. When you clicked a seat, your picture would appear, and as you reacted, emojis would float out of your seat for everyone to see. You could also write short messages and they would pop out of your seat too, simulating shouts to the crowd.</p><p id="3699" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">We had built something. Now, would anyone care?</p><h1 id="4218" class="kz la ir bn lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv lw gb"><strong class="ba">Growth</strong></h1><p id="f434" class="pw-post-body-paragraph jo jp ir jq b jr lx jt ju jv ly jx jy jz lz kb kc kd ma kf kg kh mb kj kk kl ik gb">Attention is the most valuable resource I know of and there are lots of ways to get it. Attention is commonly bought. Attention can be diverted from other channels. It can even be stolen.</p><p id="895c" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">But attention can also be <em class="mc">earned</em>. When people <em class="mc">love</em> a product it not only has true staying power but will grow organically. 
Therefore, I believe the best way to see if people love something new is to say nothing and observe.</p><p id="2d0a" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">I invited some close colleagues to try Awedience for an all-hands and see what would happen. Intuitively, they took a seat, started reacting, and used it throughout the meeting. I was <a class="au km" href="https://sive.rs/hellyeah" rel="noopener ugc nofollow" target="_blank">inundated</a> with positive feedback. Awedience didn’t do much, but what it did do, it did well enough. Often, in a race to release, minimally viable products are either broken or confusing, and it becomes impossible to tell why they didn’t work: bad idea or bad execution?</p><p id="cce1" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">At the time, Brian hosted all-company Q&amp;As weekly. I created a calendar event alongside the all-company one with an alternative URL and only invited people who had used Awedience already. Within a few months, Awedience was so popular that it was offered as a secondary option in the official calendar invites.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mk"><img alt="A screenshot of Airbnb’s Slack while Awedience was being used for the first time. 
Some messages say “we’re sitting together,” “love this feature it’s so cute,” “this is the coolest,” and “I noticed you sat next to me.”" class="cf kx ky" src="https://miro.medium.com/max/1400/1*DPnN31xW0LgGXbbysUO_fw.png" width="700" height="406" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">A glimpse into our company Slack: first reactions to Awedience at Airbnb.</em></figcaption></figure><p id="3cdb" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">With the increased popularity, it was hard to support Awedience on nights and weekends. I asked my team for time to redirect my focus to Awedience for a few months, and they were supportive. The only request was that I figure out the product’s future by the end of that time and not leave things open-ended. Would it become a part of <a class="au km" href="https://www.airbnb.com/s/experiences/online" rel="noopener ugc nofollow" target="_blank">Online Experiences</a>? Would it become a part of another team’s roadmap? We even speculated that it could be a completely new line of business!</p><p id="66ca" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">With the clock ticking, it was tempting to keep adding functionality but I had to be resourceful. Awedience was crashing during peak moments so performance was and still is the most important feature. 
Before we implemented throttling, we were binding reactions directly to our app state, which triggered a deluge of re-renders:</p><figure class="ko kp kq kr gy ks"><div class="m l dq"></div></figure><p id="66ac" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">A crude <em class="mc">batchedThrottle</em> function reduced renders when users mashed an emoji button:</p><figure class="ko kp kq kr gy ks"><div class="m l dq"></div></figure><p id="70bd" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Later, additional performance gains were found by detaching the React UI from Firebase real-time callbacks. Eventually, reactions would be managed natively without React at all:</p><figure class="ko kp kq kr gy ks"><div class="m l dq"></div></figure><p id="8a58" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">There were more affordances I wanted to explore. At sporting events, attendees often hold up signs, flags, or even paint things on their bodies to spell out a message. Replicating this in Awedience was a huge hit. Rather than a profile picture, attendees could now pick a color, letter, or a portion of a graphic to display from their seat. People show up early and coordinate amongst themselves to spell out messages to represent their team, city, or the company. The result is magical. Awedience didn’t make it easy to tell your neighbor to change their seat picture. People were going out of their way to coordinate with one another. Human connection, check!</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mn"><img alt="A screenshot of Awedience during an all-company meeting. Brian Chesky, CEO, is in the center and is surrounded by thousands of employees represented by tiny squares. 
The squares spell out words like “insurance,” “human,” or “trust.” There are also a few Airbnb logos being drawn out in 5x5 tiles." class="cf kx ky" src="https://miro.medium.com/max/1400/1*DsAVOD_cZgUwAP39X6qQZQ.jpeg" width="700" height="732" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">An earlier version of Awedience where people were spelling words, representing their teams, cities, and the company. Or the CEO’s face.</em></figcaption></figure><p id="55ac" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Making sure there were enough seats for everyone was also a challenge. Too few seats and some people can’t participate; too many and the auditorium feels empty. For this, I gave Awedience a trick its skeuomorphic counterpart can’t do: adding seats as needed. This feature felt vital for a product built at a company so focused on belonging. Later, we would improve on this feature by increasing seating density such that almost 1,000 people are visible “above the fold.”</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mo"><img alt="A video of Brian Chesky being interviewed on CBS plays in the middle. Around the video are viewers filling in. As seats run out, more seats are added throughout the virtual auditorium, creating space for more people. As seats are added, the existing seats become a bit smaller." class="cf kx ky" src="https://miro.medium.com/max/1400/1*kwLwo5MNpcT8HLk8xq3L7g.gif" width="700" height="556" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">Adding rows of seats to the bottom worked for a long time but limited users from seeing everyone at once. 
It took the addition of virtual aisles to afford seats being added horizontally without compromising user-generated seat art.</em></figcaption></figure><p id="c0f2" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">I also prioritized self-service features. Brian’s staff immediately wanted to know what content was spurring engagement — a question they hadn’t been able to answer since going remote. I piped cumulative data from an event into a graphing library for quick and dirty analytics. Similarly, our video production team wanted to be able to create and edit auditoriums without relying on me so that self-service tooling came as well.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mp"><img alt="A stacked line graph of emojis over time. One line, applause, for example goes up to over 300 reactions for a moment. Later, hearts, spike to nearly 150. Each emoji has its own usage graphed on the chart." 
class="cf kx ky" src="https://miro.medium.com/max/1400/1*ceUehJwN6__G7Q_OIQoZrA.png" width="700" height="477" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">Awedience helps presenters understand exactly which parts of their presentation landed for their audience.</em></figcaption></figure><p id="3f35" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Later, during a hackathon, we even created applause sound effects that naturally scale up from the sound of a few hands clapping to an uproar based on audience engagement.</p><figure class="ko kp kq kr gy ks"><div class="m l dq"><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">Awedio allows you to hear the audience’s applause reactions.</em></figcaption></div></figure><h1 id="c798" class="kz la ir bn lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv lw gb">Moments</h1><p id="fa4e" class="pw-post-body-paragraph jo jp ir jq b jr lx jt ju jv ly jx jy jz lz kb kc kd ma kf kg kh mb kj kk kl ik gb">Awedience made Airbnb feel like Airbnb again. There’s now a place where you can see everyone and feel connected to them. It’s become home to celebratory moments and a place where we can be there alongside one another during somber moments.</p><p id="bbad" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">When Airbnb announced a cutback to our workforce, there was an all-hands scheduled to honor and appreciate the employees who were leaving. However, with VPN access cut for roughly 2,000 soon-to-be alumni, Awedience was suddenly only accessible to the spared employees. Resident security guru <a class="au km" href="https://www.linkedin.com/in/keeleysam" rel="noopener ugc nofollow" target="_blank">Sam Keeley</a> and I committed to making Awedience accessible outside of VPN and almost overnight switched authentication to Google IAP. 
When the founders addressed the company, they invited a standing ovation for our departing peers and Awedience obliged. It’s hard to imagine what kind of impersonal and solitary send-off we would have had without Awedience.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mr"><img alt="A sequence of screenshots taken from the product during a virtual standing ovation, where the founders are seen in the video physically standing and applauding and applause emojis are bursting from viewers’ seats." class="cf kx ky" src="https://miro.medium.com/max/1400/1*znVI-1UvXbpPEn9WClZzJw.gif" width="700" height="419" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">Hundreds of employees joined our founders in a virtual standing ovation for the members of our team who were let go as a result of the pandemic cutbacks.</em></figcaption></figure><p id="cc7c" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">In May, after the death of George Floyd, the company met to address the Black Lives Matter movement. At the end of this meeting, Brian invited the company to take an 8-minute, 46-second moment of silence.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn ms"><img alt="A sequence of screenshots taken from the product during a moment of silence for George Floyd displaying messages like “Black Lives Matter” and “I can’t breathe” with hundreds of people representing their seats with a black square." 
class="cf kx ky" src="https://miro.medium.com/max/1400/1*-7x_Yi1B3DiKxQTq6yv_aA.gif" width="700" height="465" /></div></div><figcaption class="me bm go gm gn mf mg bn b bo bp co"><em class="mh">Employees join in a dark yet moving virtual moment of silence for George Floyd.</em></figcaption></figure><h1 id="db7f" class="kz la ir bn lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv lw gb"><strong class="ba">Conclusion</strong></h1><p id="13cc" class="pw-post-body-paragraph jo jp ir jq b jr lx jt ju jv ly jx jy jz lz kb kc kd ma kf kg kh mb kj kk kl ik gb">At Airbnb, Awedience is here to stay and now receives ongoing support and maintenance. I worked with our Employee Experience team to find a home where it would make sense long term and then joined that team. In fact, if this is the kind of work you find interesting, you may even consider joining our team to help us build internal tools to foster connection — <a class="au km" href="https://careers.airbnb.com/positions/3714489/" rel="noopener ugc nofollow" target="_blank">we’re hiring</a>!</p><p id="f0b4" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">I feel fortunate to work for a company that creates space for things like this to come to life. Airbnb is an inspiring place — the combination and culmination of a rigorous entrepreneurial spirit and an ongoing commitment to outdo the status quo. That’s the type of environment you need for ideas like Awedience to come to life.</p><p id="68d7" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Awedience is more than just a triumph of passion and creativity. The spark of the idea was just that: a spark. 
In the words of Richard Branson, “<a class="au km" href="https://twitter.com/richardbranson/status/264067714266587136?s=20&amp;t=Bs83Ylrn7ZYY0JTGoDUoUw" rel="noopener ugc nofollow" target="_blank">opportunities are like the buses — there’s always one coming</a>.” Without the help and support of many amazingly talented colleagues, there would literally be nothing to write about.</p><p id="f50d" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">What makes Awedience awesome is the people. Big ideas are rarely the consequence of one person’s ideas or effort. It takes a lot of people to do incredible things.</p><p id="1714" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb"><strong class="jq is">Interested in working with Avand at Airbnb? Check out these open roles:</strong><br /><a class="au km" href="https://careers.airbnb.com/positions/3714489/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Airfam Products</a><br /><a class="au km" href="https://careers.airbnb.com/positions/3955056/" rel="noopener ugc nofollow" target="_blank">Staff Technical Program Manager, Insurance Platform</a><br /><a class="au km" href="https://careers.airbnb.com/positions/3988445/" rel="noopener ugc nofollow" target="_blank">Staff Automation Engineer, BizTech Global Ops</a><br /><a class="au km" href="https://careers.airbnb.com/positions/4003012/" rel="noopener ugc nofollow" target="_blank">Operations Engineer</a></p><h1 id="2dba" class="kz la ir bn lb lc ld le lf lg lh li lj lk ll lm ln lo lp lq lr ls lt lu lv lw gb"><strong class="ba">Acknowledgements</strong></h1><p id="ae49" class="pw-post-body-paragraph jo jp ir jq b jr lx jt ju jv ly jx jy jz lz kb kc kd ma kf kg kh mb kj kk kl ik gb">To Stepan Parunashvili for fueling the fire and bootstrapping the infrastructure that got Awedience going. Without you, it would not have been possible. 
Thank you.</p><p id="a78a" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Sam Keeley for enabling and evolving Awedience access for the entire company.</p><p id="4a9d" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Joe Gebbia for creating some air space for Awedience to grow and evolve.</p><p id="dff6" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Byoung Bae, Allison Frelinger, Darrick Brown, and Judd Antin of my former team for taking a gamble on Awedience with me.</p><p id="f248" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Liz Kleinman and Beth Axelrod for creating a role for me to continue this work.</p><p id="024d" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Shawdi Ilbagian Hahn, Dave O’Neill, Kylie McQuain, Kelly Bechtel, Kate Walsh, Benny Etienne, Carrie Kissell, Alyce Thompson, John Lawrence, and Samantha Eaton for your collaboration and partnership in keeping the company engaged and connected.</p><p id="25a4" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Cory Boldt, Steven McNellie, Garrett McGrath, Alex Lacayo, John Espey, and Scott Ethersmith for your help and creativity on the technical productions.</p><p id="8c0d" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Jenna Cushner, Ortal Yahdav, Lucille Hua, Christian Williams, Shawn Terasaki, Brian Wallerstein, Ben Muschol, Mike Fowler, Jason Goodman, Caty Kobe, Joe Lencioni, Nicolas Haunold, Christian Baker, Alan Sun, and Jacqui Watts for your early contributions and feedback.</p><p id="2cd0" 
class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Kevin Swint, Danielle Zloto, Christine Berry, Federica Petruccio, and Consuelo Hernandez for going above and beyond to try Awedience with Online Experiences and the powerful insights that were created as a result.</p><p id="23d8" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Nicholas Roth, Izzy Rattner, Jonathan Lieberman, Stephen Gikow, Steve Flanders, Lonya Breitel, Alan Shum, Brian Savage, Veronica Mariano, Allie Hastings, Alica Del Valle, Rajiv Patel, and Emily Bullis for your legal support in protecting Awedience’s intellectual property and making external partnerships possible.</p><p id="029c" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Sarah Baker for always rallying people together to create seat artwork.</p><p id="8ec0" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Gaurav Mathur, Hope Eckert, Sean Abraham, Jessie Li, Vaithiyanathan Sundaram, Andy Yasutake, Virginia Vickery, Jonathan Rahmani, Andrew Pariser, Sunakshi Kapoor, Diane Ko, Biki Berry, Francisco Diaz, Erik Ritter, Tony Gamboa, Mohsen Azimi, Bruce Paul, Omari Dixon, Sonia Anderson, CJ Cipriano, Chihwei Yeh, Arie Van Antwerp, Victor De Souza, Sam Shadwell, Deanna Bjorkquist, Jenna Cushner, Richard Kirk, Jake Silver, Alex Rosenblatt, David He, LA Logan, Ryan Booth, Pistachio Matt, Melanie Cebula, Brian Morearty, and Victor Caruso for your participation and support!</p><p id="33c1" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Stephanie Wei, Micah Roumasset, Ryland Harris, Waylon Janowiak, and Ben Arnon for your willingness to try Awedience outside of Airbnb.</p><p id="8327" class="pw-post-body-paragraph 
jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Jerry Chabolla, Nicholas Schell, Ryan Jespersen, Sergio Garcia Murillo, Wes Dagget, and the entire team at Millicast for enabling real-time streaming.</p><p id="20e4" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">To Brett Bukowski, Cara Moyer, and Lauren Mackevich for encouragement and support in writing this blog post.</p><p id="c33d" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">And lastly to Danee Chavez for powering the light bulb.</p><p id="9a74" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb"><em class="mc">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/hacking-human-connection-the-story-of-awedience-da90af2b1d0</link>
      <guid>https://medium.com/airbnb-engineering/hacking-human-connection-the-story-of-awedience-da90af2b1d0</guid>
      <pubDate>Tue, 22 Mar 2022 22:00:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Measuring Latency Overhead with Own Time]]></title>
      <description><![CDATA[<header class="pw-post-byline-header gp gq gr gs gt gu gv gw gx gy l"><div class="o gz u"><div class="o"><div class="fl l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@oneill?source=post_page-----f4373f586ca-----------------------------------"><div class="l dq"><img alt="Jimmy O’Neill" class="l ci fn ha hb" src="https://miro.medium.com/fit/c/96/96/1*YY9H-aABYFitBf9mlfCIvg.png" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author bn b do dp gb"><div class="hc o hd"><div><div class="cj" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@oneill?source=post_page-----f4373f586ca-----------------------------------">Jimmy O’Neill</a></div></div><div class="he hf hg hh hi d"></div></div><div class="o ao hu"><p class="pw-published-date bn b bo bp co">Mar 21</p><div class="hv cj" aria-hidden="true">·</div><div class="pw-reading-time bn b bo bp co">8 min read</div></div></div></div><div class="o ao"><div class="cl hw hx hy hz d"><div></div></div></div></div></div></header><section><div><div class="ik il im in io"><div class=""><p id="1083" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">by: <a class="au km" href="https://www.linkedin.com/in/jimmyoneill" rel="noopener ugc nofollow" target="_blank">Jimmy O’Neill</a></p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn kn"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/1*JQ6ZtmHuSu-86FXV1i7Lmg.jpeg" width="700" height="467" role="presentation" /></div></div></figure><h2 id="5406" class="kz la ir bn lb lc ld le lf lg lh li lj jz lk ll lm kd ln lo lp kh lq lr ls lt gb">A new metric to quantify the latency overhead of our Viaduct framework</h2><p id="c4be" 
class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb"><a class="au km" rel="noopener" href="https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344">Viaduct</a>, a GraphQL-based data-oriented service mesh, is Airbnb’s paved road solution for fetching internal data and serving public-facing API requests. As a unified data access layer, the Viaduct framework handles high throughput and is capable of dynamically routing to hundreds of downstream destinations when executing arbitrary GraphQL queries.</p><h1 id="626c" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Performance Challenges in Viaduct</h1><p id="5720" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb">Viaduct’s role as a data access layer puts it in the critical path of most activity on Airbnb. This makes runtime performance of utmost importance as overhead in the framework will apply universally and can have a multiplicative effect. At the same time, Viaduct accepts arbitrary queries against the unified data graph. In practice, this amounts to many thousands of heterogeneous queries in production, each of which is capable of making an arbitrary number of downstream and often concurrent calls during the course of query execution.</p><p id="fcba" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">This presented a challenge for us. Runtime overhead in Viaduct is crucial for us to monitor and improve, but we did not have a good measure for it. Metrics on end-to-end query latencies are confounded by the performance of downstream services, making it difficult to accurately judge the effect of a performance intervention in Viaduct. 
We needed a metric that <em class="mq">isolates</em> the performance impact of Viaduct changes from the performance impact of downstream services.</p><h1 id="8237" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Defining Own Time</h1><p id="f6fe" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb">To do this, we created a metric called “own time”. Own time measures the portion of a request’s wall-clock time that occurs when there are zero downstream requests in flight. The following Python-style pseudocode computes own time given a root request time span and a set of downstream fetch time spans:</p><pre class="ko kp kq kr gy mr bt ms">def calculateOwnTime(rootSpan, fetchSpans):<br />  # Own time is the wall-clock time in rootSpan with no fetch in flight.<br />  ownTime = 0<br />  maxEndTimeSoFar = rootSpan.startTime<br />  sortedFetchSpans = sorted(fetchSpans, key=lambda span: span.startTime)<br />  for fetchSpan in sortedFetchSpans:<br />    # Count the idle gap, if any, before this fetch starts.<br />    if maxEndTimeSoFar &lt; fetchSpan.startTime:<br />      ownTime += fetchSpan.startTime - maxEndTimeSoFar<br />    maxEndTimeSoFar = max(maxEndTimeSoFar, fetchSpan.endTime)<br />  # Count the tail after the last fetch ends, once, outside the loop.<br />  ownTime += rootSpan.endTime - maxEndTimeSoFar<br />  return ownTime</pre><p id="88df" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">The own time metric allows us to focus on aspects of Viaduct’s overhead that are clearly unrelated to downstream service dependencies. 
While it does not capture <em class="mq">all</em> aspects of Viaduct’s overhead, we’ve found it captures enough to be a valuable indicator of overhead costs.</p><p id="170f" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb"><strong class="jq is">Examples</strong></p><p id="965c" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">In the trivial case where all downstream calls are made serially, own time is simply the duration of the root operation span minus the sum of the downstream span durations.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mx"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/0*-fwKbBCLNP-GkWJP" width="700" height="413" role="presentation" /></div></div></figure><p id="fc73" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">When there are multiple downstream calls, they may be made fully or partially in parallel.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mx"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/0*DVUD3CPpix-Kt6h3" width="700" height="413" role="presentation" /></div></div></figure><p id="f4ed" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">In this example, the downstream calls happen partially in parallel, and the resulting own time value excludes any time during which at least one downstream request is in flight, whether or not requests overlap.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn mx"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/0*ezhwxOXWnSHMf1wj" width="700" height="419" 
role="presentation" /></div></div></figure><h1 id="e0ca" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Identifying and Reducing Runtime Latency</h1><p id="d992" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb"><strong class="jq is">Measuring the influence of CPU vs. I/O on request latency</strong></p><p id="1383" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Dividing an operation’s own time by its overall latency gives us an estimate of how CPU-bound vs. I/O-bound the operation is. We call this the “own time ratio” of a query. For example, the Viaduct operation graphed below had an own time ratio of 20%, indicating that 20% of the request runtime in Viaduct was spent with no downstream request in flight. After deploying an internal Viaduct performance improvement, this operation’s own time ratio dropped to 17%, since Viaduct overhead improved while downstream performance remained constant.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn my"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/0*wy6PADUW1fd6-F-X" width="700" height="292" role="presentation" /></div></div><figcaption class="mz bm go gm gn na nb bn b bo bp co">This graph shows a day-over-day reduction in own time ratio for a Viaduct operation after a runtime overhead improvement was deployed.</figcaption></figure><p id="7178" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">A low own time ratio for an operation indicates that the biggest overall latency gains will likely be found by optimizing downstream services, not Viaduct. A high own time ratio indicates that meaningful latency gains can come from optimizing Viaduct’s internal runtime for the operation. 
When making such optimizations for the sake of one operation, we can also use own time ratios across all operations, and especially low-ratio ones, to ensure we aren’t introducing a regression more broadly.</p><p id="0c7c" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb"><strong class="jq is">Quantifying the impact of query size on runtime overhead</strong></p><p id="85a7" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Viaduct users reported that large queries were running slower than expected, attributing the slow execution to Viaduct overhead. Before own time, we had no metrics to assess such reports. After introducing own time, we had a starting point, but we needed to refine the metric further for this use case.</p><p id="c720" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">One would expect own time to increase as the number of fields returned by an operation increases. But was that a reasonable expectation? We found that normalizing own time by the count of fields returned by an operation yields a metric that more usefully indicates, across a heterogeneous set of operations, when own time is excessive. 
We defined field count to include both object fields and individual array elements.</p><p id="af07" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">The following graph shows that there is indeed an overall relationship between own time and field count across our set of operations, as well as some outliers that have unusually high own-time-to-field-count ratios.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn nc"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/0*MoOIoSyyD2fDoUog" width="700" height="285" role="presentation" /></div></div><figcaption class="mz bm go gm gn na nb bn b bo bp co">This chart plots the number of fields resolved against own time for unique GraphQL operations.</figcaption></figure><p id="de9c" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">This relationship between field count and own time encouraged us to focus on framework logic that runs on every field for all operations, rather than other parts of the codebase. Through some CPU profiling, we were able to quickly identify bottlenecks. One resulting improvement was a change to our multithreading model for field execution, which decreased own time for all operations by 25%.</p><p id="cb83" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb"><strong class="jq is">Quantifying the impact of internal caching on runtime overhead</strong></p><p id="687f" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Viaduct saw another performance issue. For some operations, latency appeared to vary an unusual amount, even between identical requests. 
Here again, we used own time to guide our investigation into root causes.</p><p id="a49f" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Viaduct relies on a number of internal caches to ensure that execution is fast, such as a cache for parsed and validated GraphQL documents. Own time metrics indicated that Viaduct runtime overhead, not downstream service dependencies, was causing the variance in latencies. We theorized that cache misses were the culprit. To test this theory, we instrumented our caches to report whether any lookup miss occurred during an operation execution and attached this hit/miss status to our own time metric output. This allowed us to report on own time by cache hit/miss status on a per-cache, per-operation basis.</p><figure class="ko kp kq kr gy ks gm gn paragraph-image"><div role="button" tabindex="0" class="kt ku dq kv cf kw"><div class="gm gn nd"><img alt="" class="cf kx ky" src="https://miro.medium.com/max/1400/0*E3j_g3qCzFxhw_IN" width="700" height="179" role="presentation" /></div></div></figure><p id="fae8" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Adding this information to own time allowed us to both confirm our theory and quantify the potential benefit of implementing a solution, such as additional cache warming or moving in-memory caches to distributed caches, prior to committing actual engineering resources. Migrating the in-memory cache that stores the validation state of GraphQL documents to a distributed cache reduced miss rates. 
This had a significant impact on tail latencies, especially for low QPS operations that were more likely to encounter cold cache states.</p><h1 id="8fb1" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Setting Runtime Overhead Goals</h1><p id="d22c" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb">Establishing the own time metric normalized by field count ended up being a great way to account for changes in query patterns. Thus, we now use this metric, aggregated across all operations, to set framework-level performance targets that are isolated from changes in client query patterns. In particular, after measuring the base rate of normalized own time at the beginning of a quarter, we set a goal to improve normalized own time by a specific percentage quarter-over-quarter.</p><p id="9f67" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">We also use this metric, aggregated on a per-operation basis, to let operation owners know how their operation overhead compares to the rest of the system.</p><h1 id="0dc9" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Integrating Own Time Into The Release Cycle</h1><p id="aac7" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb">To quantify the runtime performance impact of a change, we can set up experiments where two staged control and treatment applications receive identical production traffic replay. We can then graph the difference in own time between them. 
This allows us to quantify the impact of various framework interventions on runtime overhead and measure each intervention’s impact against our performance goals.</p><p id="925e" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">While replay experiments help us assess the potential runtime improvements of a change on a limited set of use cases, narrowly targeted optimizations can still accidentally introduce broader performance regressions. To guard against such regressions, we leverage an automated canary analysis process before deployment. A canary instance and baseline instance receive identical production replay traffic for a period of time, and large discrepancies between them can automatically stop the deployment process. By inspecting the own time difference between the canary and baseline instances, we can identify unexpected performance regressions before they reach production.</p><p id="a001" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">In addition to automated canary analysis, graphing day-over-day, week-over-week and month-over-month own time in production shows us long-term isolated performance trends and allows us to bisect any regressions that make it to production.</p><h1 id="2783" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Limitations and Future Work</h1><p id="59ef" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb">By ignoring what Viaduct does during all downstream calls, own time does not account for possible optimizations to the call pattern of the downstream requests themselves. 
For example, a request execution may be sped up by increasing concurrency of downstream calls or removing some calls altogether.</p><p id="4f04" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Although own time gives a measure of wall-clock runtime service overhead, it does not say <em class="mq">what</em> is causing the overhead or how to best improve it, which will vary across operations in a GraphQL server. However, tracking downstream request spans in memory provides baseline data that can be enriched with other metadata and further filtered to measure the contribution of application-specific activity to own time.</p><p id="f1ed" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">Tracking down the root cause of unexpected own time changes or understanding why an operation is an own time outlier requires manual inspection and sometimes additional one-off measurements, which take valuable engineering time. We can automate the first steps in these investigations by measuring the contribution of various parts of the application to own time. This would speed up root cause analysis and limit time spent manually profiling CPU usage.</p><h1 id="4287" class="lz la ir bn lb ma mb mc lf md me mf lj mg mh mi lm mj mk ml lp mm mn mo ls mp gb">Conclusion</h1><p id="15ad" class="pw-post-body-paragraph jo jp ir jq b jr lu jt ju jv lv jx jy jz lw kb kc kd lx kf kg kh ly kj kk kl ik gb">Own time has allowed us to isolate the runtime performance characteristics of Viaduct, our GraphQL-based data-oriented service mesh. Using own time, we can precisely measure the production runtime performance effects of application changes, set downstream-independent performance goals, and measure our long-term progress against those goals for an arbitrary underlying application. 
Enriching own time with application-specific data, such as fetched field counts and cache hit/miss states in Viaduct, gives us an overarching view of the relationship between an application’s state and its runtime performance characteristics.</p></div><div class="o dz ne nf ng nh" role="separator"><div class="ik il im in io"><p id="4a05" class="pw-post-body-paragraph jo jp ir jq b jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl ik gb">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</p></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/measuring-latency-overhead-with-own-time-f4373f586ca</link>
      <guid>https://medium.com/airbnb-engineering/measuring-latency-overhead-with-own-time-f4373f586ca</guid>
      <pubDate>Mon, 21 Mar 2022 21:53:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Artificial Counterfactual Estimation (ACE): Machine Learning-Based Causal Inference at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header fh fi fj fk fl fm fn fo fp fq l"><div class="o u"><div class="o"><div class="eb l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@twozhiying?source=post_page-----ee32ee4d0512-----------------------------------"><div class="l cj"><img alt="zhiying gu" class="l dl ed fr fs" src="https://miro.medium.com/fit/c/96/96/0*pDg-pgiuhW968pS3.jpg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author cg b ch ci et"><div class="ft o fu"><div><div class="ck" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@twozhiying?source=post_page-----ee32ee4d0512-----------------------------------">zhiying gu</a></div></div><div class="fv fw fx fy fz d"></div></div><div class="o ao gq"><p class="pw-published-date cg b ej ek bw">Mar 16</p><div class="gr ck" aria-hidden="true">·</div><div class="pw-reading-time cg b ej ek bw">10 min read</div></div></div></div></div></div></header><section><div><div class="hh hi hj hk hl"><div class=""><figure class="fi fk im in io ip fe ff paragraph-image"><div role="button" tabindex="0" class="iq ir cj is dq it"><div class="fe ff il"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/1400/0*EQ_C2aqZE91XHEJ4" width="700" height="467" role="presentation" /></div></div></figure><p id="0155" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et"><strong class="iy hp">By:</strong><a class="au ju" href="https://www.linkedin.com/in/zhiying-gu-2293a353/" rel="noopener ugc nofollow" target="_blank"><strong class="iy hp"> Zhiying Gu</strong></a><strong class="iy hp">, </strong><a class="au ju" href="https://www.linkedin.com/in/qianrongwu/" rel="noopener ugc nofollow" target="_blank"><strong class="iy hp">Qianrong Wu</strong></a></p><h1 id="aabe" class="jv jw ho cg jx 
jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Summary</h1><p id="1736" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">What if you wanted to measure the impact of a change to your business, but it was not possible to run a randomized controlled experiment? That’s exactly the problem we faced when measuring the benefit of a new tool used by Airbnb operations to automate part of their workflow. Due to organizational constraints, it was simply not possible to randomly assign the tool to operations agents; even if we could make random assignments, the sample sizes were too small to generate sufficient statistical power. So what did we do? We imagined a parallel universe in which the operations agents who did not use the new tool were identical in all respects to those who did–in other words, a world in which the assignment criteria were as good as random. In this blog post, we explain this new methodology, called <strong class="iy hp">ACE (Artificial Counterfactual Estimation)</strong>, which leverages machine learning (ML) and causal inference to artificially reproduce the “counterfactual” scenario produced by random assignment. 
We’ll explain how this works in practice, why it is better than other methods such as matching and synthetic control, and how we overcame challenges associated with this method.</p><h1 id="b243" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">The Non-Randomizable Operations Problem</h1><p id="062c" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">There are two key assumptions undergirding randomized controlled <a class="au ju" rel="noopener" href="https://medium.com/airbnb-engineering/experiments-at-airbnb-e2db3abf39e7">experiments</a> (often referred to as “A/B tests”):</p><ol class=""><li id="8f99" class="ky kz ho iy b iz ja jd je jh la jl lb jp lc jt ld le lf lg et">The treatment and control groups are similar. When you have similar groups, outcomes are independent of group attributes such as age, gender, and location, meaning that any difference between the groups can be attributed to a treatment that was received by one group but not the other. In statistical terms, we assume that we have controlled all confounders, thereby reducing the bias of our estimates.</li><li id="dfe1" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt ld le lf lg et">The sample sizes are sufficiently large. Large sample sizes serve to reduce the magnitude of chance differences between the two randomized groups, giving us confidence that the treatment has a true causal impact. In technical lingo, we assume that we have reduced the variance of our estimates enough to give us appropriate statistical power.</li></ol><p id="170d" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">Given the need for similar groups and large sample sizes when running A/B tests, any organization with operational teams presents challenges. 
To start, there are general concerns about unfairness and disruptive experiences when running randomized experiments on operations agents. Second, the operational sites are located in different countries, with varying numbers of employees, skill levels, and so on, so we cannot simply assign some geographies to treatment and others to control without introducing an apples-to-oranges comparison, which would bias the measurement. Finally, we have millions of customers but not millions of operations agents, so the sample size for this test is always going to be much smaller than that of other experiments.</p><h1 id="00f3" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">ACE to the Rescue</h1><p id="dbea" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">With ACE (<strong class="iy hp">Artificial Counterfactual Estimation</strong>), we have the next best thing to a randomized experiment. The trick is to achieve <strong class="iy hp">bias reduction and variance reduction</strong> at the same time using a machine learning-based causal impact estimation technique.</p><p id="1fa8" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">Causal inference is the process of estimating the counterfactual outcome that would have occurred had the treated units not been treated. In our case, we want to know how productive our operations agents would have been, had they not used the new workflow automation tool. 
There are many ways to construct such a counterfactual outcome, but the most common methods are:</p><ul class=""><li id="ac72" class="ky kz ho iy b iz ja jd je jh la jl lb jp lc jt lm le lf lg et">Use the control group from a randomized controlled experiment (unfortunately, this is often not possible in our case)</li><li id="0601" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Construct a group that is similar to the treated group using matching methods such as Propensity Score Matching (Weighting), <a class="au ju" href="https://gking.harvard.edu/files/political_analysis-2011-iacus-pan_mpr013.pdf" rel="noopener ugc nofollow" target="_blank">Coarsened Exact Matching</a>, or <a class="au ju" href="https://web.stanford.edu/~jhain/Paper/PA2012.pdf" rel="noopener ugc nofollow" target="_blank">Entropy Balancing</a></li><li id="ae84" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Construct the counterfactual outcome with time-series predictions (e.g., <a class="au ju" href="https://research.google/pubs/pub41854/" rel="noopener ugc nofollow" target="_blank">Causal Impact Model</a>)</li><li id="3ff0" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Construct the counterfactual outcome as the weighted average of all non-treated units (<a class="au ju" href="https://economics.mit.edu/files/11859" rel="noopener ugc nofollow" target="_blank">Synthetic Control</a>, <a class="au ju" href="https://www.cambridge.org/core/journals/political-analysis/article/generalized-synthetic-control-method-causal-inference-with-interactive-fixed-effects-models/B63A8BD7C239DD4141C67DA10CD0E4F3" rel="noopener ugc nofollow" target="_blank">Generalized Synthetic Control</a>)</li></ul><p id="0409" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">We can construct the counterfactual outcome by ML prediction using both confounding and non-confounding factors as features. 
In a nutshell, we use a holdout group (i.e., the group not treated) to train an ML model that predicts the outcome without treatment in the post-treatment period. We then apply the trained model to the treated group <strong class="iy hp">for the same period. </strong>The predicted outcome serves as the counterfactual (new control) representing the imagined scenario in which the treatment group had not been treated in the post-treatment period (<em class="ln">Y’’</em> in the equation below).</p><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div class="fe ff lo"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/388/1*NKttYN8--2qKUiGFtU6b3w.png" width="194" height="54" role="presentation" /></div></figure><p id="dced" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et"><strong class="iy hp">In the equation above,<em class="ln"> t </em>is the difference between the observed treatment group outcome </strong>(<em class="ln">Y</em>)<strong class="iy hp"> and the predicted outcome </strong>(<em class="ln">Y’’</em>). It represents <strong class="iy hp">a <em class="ln">naive</em> estimate of the impact </strong>because it<strong class="iy hp"> is <em class="ln">biased</em>. </strong>Figure 1 illustrates ACE at a high level. It has the following steps:</p><ol class=""><li id="0035" class="ky kz ho iy b iz ja jd je jh la jl lb jp lc jt ld le lf lg et">We train a machine learning model using data from a holdout group, i.e. 
a group without treatment.</li><li id="ff63" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt ld le lf lg et">We apply the trained model to the treatment group to obtain the predicted outcome had we not applied the treatment to this group.</li><li id="4fe4" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt ld le lf lg et">The difference between the actual and the predicted outcome for the treatment group is the estimated impact.</li></ol><p id="886d" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">We flesh out the challenges of this approach in the next section, before turning to its application.</p><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div role="button" tabindex="0" class="iq ir cj is dq it"><div class="fe ff lt"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/1400/0*GymYMQAYRWRjpbf-" width="700" height="308" role="presentation" /></div></div><figcaption class="lu lv fg fe ff lw lx cg b ej ek bw">Figure 1: Estimation Process</figcaption></figure><h1 id="4655" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Challenges of ACE, and Solutions</h1><p id="bdb2" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">There are two major challenges in developing ACE: bias estimation and construction of confidence intervals.</p><h1 id="dd72" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Challenge 1: Bias estimation</h1><p id="ede0" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">The predicted outcome <strong class="iy hp"><em class="ln">Y’’</em> </strong>from the machine learning models is often biased for two reasons, causing the estimated causal impact <em class="ln">t</em> to also be biased (see <a class="au ju" href="https://academic.oup.com/ectj/article/21/1/C1/5056401" 
rel="noopener ugc nofollow" target="_blank">Chernozhukov et al. (2018)</a>). The two reasons for bias are 1) regularization, and 2) overfitting.</p><p id="8c1d" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">The figure below shows the ML model prediction error on 100 synthetic A/A tests, for which the estimated impact should always be zero. Clearly, however, the distribution of estimates is not centered around zero. The average prediction error is actually 2%, meaning that the ML prediction <em class="ln">Y’’</em> is, on average, overestimated by 2%.</p><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div class="fe ff ly"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/984/0*AQMF97y43O2klwcr" width="492" height="348" role="presentation" /></div><figcaption class="lu lv fg fe ff lw lx cg b ej ek bw">Figure 2: Prediction Bias</figcaption></figure><h1 id="8ac0" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Challenge 2: Construction of Confidence Intervals</h1><p id="ad66" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">Unlike in a traditional t-test for A/B testing, there is no analytical solution for confidence intervals when we are doing ACE. As a result, we have to construct empirical confidence intervals for the estimates. 
To address these two challenges, we took an empirical approach to removing bias from the prediction and then constructed our confidence intervals based on that same empirical approach.</p><p id="e1cd" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">In ACE, we use A/A tests both for debiasing and for constructing confidence intervals.</p><h1 id="700c" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Solution to Challenge 1: Debias</h1><p id="1338" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">One natural idea is that if we can confidently estimate the magnitude of the bias, we can simply adjust the prediction by the estimated bias. The estimation then becomes:</p><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div class="fe ff lz"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/816/1*tf49yKKBFKPqxqyxkTa4EA.png" width="408" height="142" role="presentation" /></div></figure><p id="9336" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">Practitioners can freely choose any machine learning model, <em class="ln">f(X)</em>, for the prediction of <em class="ln">Y’’.</em> Figure 2 shows a 2% bias for 100 A/A samples. The question is: can we say the true bias is 2%? If we can verify that the bias is systematically 2% (i.e., consistent across different A/A samples during the same period and repeatable across different time periods), we can say bias = 2%. Figure 3 shows the repeatability of the bias estimation over time. The estimates are always biased upwards and the average estimates of bias are around 2%. Figure 4 shows the average prediction error after removing the bias (2%). 
With bias correction, the distribution of estimated impact is centered around zero.</p><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div role="button" tabindex="0" class="iq ir cj is dq it"><div class="fe ff il"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/1400/0*3VxewhXE-I6q6e3B" width="700" height="397" role="presentation" /></div></div><figcaption class="lu lv fg fe ff lw lx cg b ej ek bw">Figure 3: The stability of bias estimation</figcaption></figure><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div role="button" tabindex="0" class="iq ir cj is dq it"><div class="fe ff ma"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/1400/0*BIyfw_s9Jo80_zya" width="700" height="469" role="presentation" /></div></div><figcaption class="lu lv fg fe ff lw lx cg b ej ek bw">Figure 4: Distribution of impact estimates based on A/A after bias correction</figcaption></figure><h1 id="49c2" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Solution to Challenge 2: Construct Empirical Confidence Intervals</h1><p id="3dc2" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">We can use data from A/A tests to construct empirical confidence intervals and p-values.</p><ul class=""><li id="aa58" class="ky kz ho iy b iz ja jd je jh la jl lb jp lc jt lm le lf lg et">Empirical confidence interval: the 95% confidence interval is constructed from the distribution of estimated impacts across 100 bootstrapped A/A samples. Because we know the true difference in an A/A test is 0, if 5% of the estimated impacts from the 100 A/A tests fall outside the range [-0.2, 0.2], then the 95% confidence interval is [-0.2, 0.2].</li><li id="6a4c" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Empirical p-value: we can estimate Type I error via A/A tests estimated from ML models as follows. 
Suppose we estimated a 3% impact for the treatment. The p-value is the probability of obtaining an estimate outside [-3%, 3%] when the null hypothesis (there is no impact) is true. This probability is estimated from the empirical distribution of repeated A/A tests. If the probability is 1%, we will conclude that we have at least 98% (i.e., 100% - (1% * 2)) confidence that the alternative hypothesis (the impact is not zero) is true.</li></ul><h1 id="9d57" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Validation</h1><p id="8650" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">To validate whether ACE can accurately measure the impact, we further applied ACE to data from a large-scale randomized A/B test and compared the ACE results with the A/B test results. The result from the A/B test is considered the ground truth for validation because A/B testing is the gold standard for measurement. The results are nearly identical.</p><h1 id="13d1" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Advantages of ACE</h1><p id="cadf" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">There are several advantages of ACE over other estimation methods:</p><ul class=""><li id="e9ef" class="ky kz ho iy b iz ja jd je jh la jl lb jp lc jt lm le lf lg et">It is flexible in the choice of estimation model. 
We can freely choose any cutting-edge ML model to achieve the desired level of accuracy, based on various use cases and data properties.</li><li id="52a2" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Its validity and accuracy can be easily assessed during the design phase of the measurement plan by conducting A/A tests.</li><li id="9365" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">It can be applied both to experimental data for variance reduction and to non-experimental data for bias correction as well as variance reduction.</li><li id="5baf" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">For experimental data:<br />- It is less prone to biases compared to regression adjustments. <br />- It has more power compared to stratification when the ML model has a good performance. <br />- It estimates the magnitude of the impacts instead of only the existence of the impacts compared to rank tests.</li></ul><p id="fd7d" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">You’ll recall that we applied ACE to estimate the incremental benefit of a tool that helps operations agents to automate part of their workflow. We generated p-values for three different measurement methodologies: (1) classic t-test; (2) <a class="au ju" href="https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test" rel="noopener ugc nofollow" target="_blank">non-parametric rank test</a>; and (3) ACE non-parametric test based on the empirical confidence interval we described in the previous section. 
The following is a performance comparison of the t-test, rank test, and ML-based methods at the same sample size, in particular for the case where the sample size is small and we try to conduct inference with a classic t-test, as we do in A/B testing.</p><figure class="lp lq lr ls fq ip fe ff paragraph-image"><div role="button" tabindex="0" class="iq ir cj is dq it"><div class="fe ff il"><img alt="" class="dq iu iv" src="https://miro.medium.com/max/1400/0*EnfVbOcLKkcDHoEa" width="700" height="207" role="presentation" /></div></div></figure><h1 id="9488" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Recap</h1><p id="6e0b" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">In this blog post, we explained how one can leverage ML for counterfactual prediction, using an estimation problem for the efficacy of an agent tool as our motivating example.</p><p id="d36d" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">Combining statistical inference and machine learning methods is a powerful approach when it’s not possible to run an A/B test. However, as we have seen, it can be dangerous to apply ML methodologies if intrinsic model bias is not addressed. 
This post outlined a practical and reliable way to correct for this intrinsic bias, while minimizing Type I error relative to competing methods.</p><p id="1ff0" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">Currently, we are working to turn our code template into an easy-to-use Python package that will be accessible to all data scientists within the company.</p><p id="4632" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et">If this type of work interests you, check out some of our related positions!</p><p id="62a4" class="pw-post-body-paragraph iw ix ho iy b iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js jt hh et"><a class="au ju" href="https://careers.airbnb.com/positions/3859241/" rel="noopener ugc nofollow" target="_blank">Senior Data Scientist - Payments</a></p><h1 id="4a5c" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Acknowledgments</h1><p id="119b" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et">Thanks to Alex Deng and Lo-hua Yuan for providing feedback on the development of ACE and spending time reviewing the work. We would also like to thank Airbnb Experiment Review Committee Members for feedback and comments. Last but not least, we really appreciate Joy Zhang and Nathan Triplett for their guidance, and feedback and support from Tina Su, Raj Rajagopal and Andy Yasutake.</p><h1 id="05e3" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">References</h1><ul class=""><li id="0542" class="ky kz ho iy b iz kt jd ku jh mb jl mc jp md jt lm le lf lg et">Stefano M. Iacus, Gary King, Giuseppe Porro, 2017. 
<a class="au ju" href="https://gking.harvard.edu/files/political_analysis-2011-iacus-pan_mpr013.pdf" rel="noopener ugc nofollow" target="_blank">Causal Inference without Balance Checking: Coarsened Exact Matching</a>, <em class="ln">Political Analysis.</em></li><li id="1aad" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Jens Hainmueller, 2012, <a class="au ju" href="https://web.stanford.edu/~jhain/Paper/PA2012.pdf" rel="noopener ugc nofollow" target="_blank">Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies</a>, <em class="ln">Political Analysis.</em></li><li id="4691" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et"><a class="au ju" href="https://research.google/people/KayBrodersen/" rel="noopener ugc nofollow" target="_blank">Kay H. Brodersen</a>, Fabian Gallusser, Jim Koehler, <a class="au ju" href="https://research.google/people/NicolasRemy/" rel="noopener ugc nofollow" target="_blank">Nicolas Remy</a>, Steven L. Scott, 2015. <a class="au ju" href="https://research.google/pubs/pub41854/" rel="noopener ugc nofollow" target="_blank">Inferring causal impact using Bayesian structural time-series models</a>, <em class="ln">Annals of Applied Statistics</em>.</li><li id="8002" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Alberto Abadie, Alexis Diamond, and Jens Hainmueller, 2010. 
<a class="au ju" href="https://economics.mit.edu/files/11859" rel="noopener ugc nofollow" target="_blank">Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program</a>, <em class="ln">Journal of the American Statistical Association.</em></li><li id="3375" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Yiqing Xu, 2017. <a class="au ju" href="https://www.cambridge.org/core/journals/political-analysis/article/generalized-synthetic-control-method-causal-inference-with-interactive-fixed-effects-models/B63A8BD7C239DD4141C67DA10CD0E4F3" rel="noopener ugc nofollow" target="_blank">Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models</a>, <em class="ln">Political Analysis.</em></li><li id="f74b" class="ky kz ho iy b iz lh jd li jh lj jl lk jp ll jt lm le lf lg et">Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, James Robins, 2018. 
<a class="au ju" href="https://academic.oup.com/ectj/article/21/1/C1/5056401" rel="noopener ugc nofollow" target="_blank">Double/debiased machine learning for treatment and structural parameters</a>, <em class="ln">The Econometrics Journal.</em></li></ul><h1 id="6287" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">Further Reading on Similar Topics</h1><ul class=""><li id="7dc5" class="ky kz ho iy b iz kt jd ku jh mb jl mc jp md jt lm le lf lg et"><a class="au ju" rel="noopener" href="https://medium.com/airbnb-engineering/how-airbnb-measures-future-value-to-standardize-tradeoffs-3aa99a941ba5">How Airbnb Measures Future Value to Standardize Tradeoffs</a></li></ul><h1 id="67f7" class="jv jw ho cg jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks et">****************</h1><p id="3adb" class="pw-post-body-paragraph iw ix ho iy b iz kt jb jc jd ku jf jg jh kv jj jk jl kw jn jo jp kx jr js jt hh et"><em class="ln">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/artificial-counterfactual-estimation-ace-machine-learning-based-causal-inference-at-airbnb-ee32ee4d0512</link>
      <guid>https://medium.com/airbnb-engineering/artificial-counterfactual-estimation-ace-machine-learning-based-causal-inference-at-airbnb-ee32ee4d0512</guid>
      <pubDate>Wed, 16 Mar 2022 20:34:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Rebuilding Payment Orchestration at Airbnb]]></title>
      <description><![CDATA[<header class="pw-post-byline-header fj fk fl fm fn fo fp fq fr fs l"><div class="o u"><div class="o"><div class="eb l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@bryonr?source=post_page-----341d194a781b-----------------------------------"><div class="l cj"><img alt="Bryon Ross" class="l dl ed ft fu" src="https://miro.medium.com/fit/c/96/96/1*UaYOQDZu4nVVlFmHfBbiIA.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="pw-author cg b ch ci et"><div class="fv o fw"><div><div class="ck" role="tooltip" aria-hidden="false"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@bryonr?source=post_page-----341d194a781b-----------------------------------">Bryon Ross</a></div></div><div class="fx fy fz ga gb d" aria-hidden="true">·</div><div class="fy fz ga gb d"><div class="cg b ch ci ev"><a class="ev gc aw ax ay az ba bb bc bd gd ge bg gf gg" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2F9d1363f2c38a&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Frebuilding-payment-orchestration-at-airbnb-341d194a781b&amp;user=Bryon+Ross&amp;userId=9d1363f2c38a&amp;source=post_page-9d1363f2c38a----341d194a781b---------------------follow_byline--------------">Follow</a></div></div></div></div><div class="o ao gh"><p class="pw-published-date cg b ej ek bw">Feb 24</p><div class="fx ck" aria-hidden="true">·</div><div class="pw-reading-time cg b ej ek bw">10 min read</div></div></div></div><div class="gi gj gk gl gm d"><div class="ck" aria-hidden="false"></div></div></div></header><section><div><div class="gz ha hb hc hd"><div class=""><div class=""><h2 id="4b14" class="pw-subtitle-paragraph id hf hg cg b ie if ig ih ii ij ik il im in io ip iq ir is it iu bw">How we maintained reliable money movement while migrating Airbnb’s payment 
orchestration system from the legacy monolithic application to a service-oriented architecture</h2></div><p id="d6aa" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><strong class="ix hh">By:</strong> <a class="au jr" href="https://www.linkedin.com/in/bryon-ross/" rel="noopener ugc nofollow" target="_blank">Bryon Ross</a>, <a class="au jr" href="https://www.linkedin.com/in/feifeng-yang-339b8b33/" rel="noopener ugc nofollow" target="_blank">Feifeng Yang</a>, <a class="au jr" href="https://www.linkedin.com/in/sophie-behr-6874b734/" rel="noopener ugc nofollow" target="_blank">Sophie Behr</a>, <a class="au jr" href="https://www.linkedin.com/in/johnsont/" rel="noopener ugc nofollow" target="_blank">Theresa Johnson</a>, <a class="au jr" href="https://www.linkedin.com/in/xin-lin-39527b58/" rel="noopener ugc nofollow" target="_blank">Xin Lin</a>, <a class="au jr" href="https://www.linkedin.com/in/yunjincho/" rel="noopener ugc nofollow" target="_blank">Yun Jin Cho</a></p><figure class="jt ju jv jw fs jx fg fh paragraph-image"><div role="button" tabindex="0" class="jy jz cj ka dq kb"><div class="fg fh js"><img alt="" class="dq kc kd" src="https://miro.medium.com/max/1400/1*xkVmheTk8AghzmfFEJuIKg.jpeg" width="700" height="467" role="presentation" /></div></div></figure><h1 id="590b" class="ke kf hg cg kg kh ki kj kk kl km kn ko im kp in kq ip kr iq ks is kt it ku kv et">Introduction</h1><p id="0c7c" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">Airbnb’s payment orchestration system is responsible for ensuring reliable money movement between hosts, guests, and Airbnb. In short, guests should be charged the right amount at the right time using their selected payment methods; hosts should be paid the right amount at the right time to their desired payout methods. 
For historical reasons, Airbnb’s billing data, payment APIs, payment orchestration, and user experiences were tightly coupled with the concept of a reservation for a stay. Unfortunately, this meant that a payment-related feature for stays had to be rebuilt for other products — for example, Airbnb Experiences — and each implementation may have its own product-specific quirks. As you can imagine, this approach is neither scalable nor easy to maintain.</p><p id="90da" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">For several years, Airbnb has been migrating away from our monolithic Ruby on Rails application toward a service-oriented architecture (SOA). This migration has been discussed extensively in several Airbnb <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/building-services-at-airbnb-part-1-c4c1d8fa811b">tech</a> <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/building-services-at-airbnb-part-2-142be1c5d506">blog</a> <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/building-services-at-airbnb-part-3-ac6d4972fc2d">posts</a>. We will gloss over some of the technical discussions common to those migrations and instead focus on some of the aspects that were unique to migrating our payments systems. While many teams at Airbnb chose to create a one-to-one replacement when migrating to SOA, the payments organization instead decided to use it as an opportunity to fundamentally redesign our services to provide a sound technical foundation for future growth. 
As a consequence of this decision, the migration process took longer to complete than a more straightforward one-to-one replacement.</p><h1 id="d055" class="ke kf hg cg kg kh ki kj kk kl km kn ko im kp in kq ip kr iq ks is kt it ku kv et">Why Redesign?</h1><figure class="jt ju jv jw fs jx"><div class="m l cj"><figcaption class="ld le fi fg fh lf lg cg b ej ek bw">Airbnb CEO Brian Chesky tells a story about the origin of payments at Airbnb</figcaption></div></figure><p id="78a8" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">As Brian shared in the video above, support for on-platform payments has played a critical role in establishing trust among Airbnb’s hosts and guests. Airbnb has grown significantly since our first payment system was created over a decade ago and, with that growth, the scope and scale of payments at Airbnb have also grown and changed. Many of the original payment models were tied closely to reservations for a stay. This made sense in the early days of Airbnb as there was only one product, and the engineers working on payments at that time did an excellent job developing a solution that solved the needs of guests and hosts. While these original models used for payments have proven extremely versatile and powerful, this tight coupling between stays and payments has led to increased complexity when adding new products like Experiences or features like the Resolution Center.</p><p id="bd30" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">When planning for the SOA migration, Airbnb’s payments teams made a bold decision to fundamentally redesign the payments system. Our goal was to create a payment platform that would allow teams across Airbnb to quickly, easily, and safely integrate new features and products with payments. 
It’s not feasible to list all of the enhancements in a single blog post, so this post will focus on some design highlights affecting the new payment orchestration system: idempotency, platformization, and data immutability.</p><h2 id="d659" class="lh kf hg cg kg li lj lk kk ll lm ln ko je lo lp kq ji lq lr ks jm ls lt ku lu et">Idempotent Orchestration</h2><p id="5487" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">As discussed in an <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/avoiding-double-payments-in-a-distributed-payments-system-2981f6b070bb">earlier blog post</a>, idempotency is a common technique to maintain consistency among distributed services. The new payment orchestration system was designed around Orpheus (the idempotency framework described in that post). Every major workflow is divided into a directed acyclic graph (DAG) of retryable idempotent steps, each with well-defined behavior. This allows the payment orchestration layer to maintain eventual consistency with other key services (such as the payment gateway layer and product fulfillment services). This approach has led to five 9s (99.999%) of consistency for payments.</p><p id="793e" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">The idempotency framework works well for both synchronous and asynchronous communication between services. For asynchronous communication, payments services primarily use a Kafka-based message bus to send “events” to one another. Event processors use the idempotency framework to enhance the at-least-once guarantee of Kafka into an exactly-once guarantee. 
The transactional integrity analysis tools described in <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/measuring-transactional-integrity-in-airbnbs-distributed-payment-ecosystem-a670d6926d22">this post</a> provide an additional layer of confidence by ensuring consistency between events and transactional data sources.</p><h2 id="f99b" class="lh kf hg cg kg li lj lk kk ll lm ln ko je lo lp kq ji lq lr ks jm ls lt ku lu et">Product-Agnostic Platform</h2><figure class="jt ju jv jw fs jx fg fh paragraph-image"><div role="button" tabindex="0" class="jy jz cj ka dq kb"><div class="fg fh lv"><img alt="" class="dq kc kd" src="https://miro.medium.com/max/1400/1*wXTahHCWmRKVpdsQeFwa3w.png" width="700" height="664" role="presentation" /></div></div><figcaption class="ld le fi fg fh lf lg cg b ej ek bw">The payments SOA migration decoupled product fulfillment, payment orchestration, and pricing</figcaption></figure><p id="5dc2" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">A significant disadvantage of our legacy payment data models is that they were closely tied to a single product, reservations for stays. For this reason, our new payment orchestration service was intentionally architected to avoid tightly coupling the payments system to any particular product. Instead, the new orchestration layer was designed around generic payment-specific workflows (e.g., validation, payment processing, financial reporting) with payment-specific logic and product-specific logic isolated from one another, with the exception of a few well-defined integration points. 
When combined with the generic billing and pricing APIs described in <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/scaling-airbnbs-payment-platform-43ebfc99b324">this blog post</a>, this approach allows new products to integrate quickly and easily with existing generic payment flows, drastically reducing both engineering effort and time to delivery. Additionally, as new features are added to the payment systems, these features can be easily adopted by other products.</p><h2 id="406b" class="lh kf hg cg kg li lj lk kk ll lm ln ko je lo lp kq ji lq lr ks jm ls lt ku lu et">Data Immutability</h2><p id="3d87" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">Immutable data is easier to understand, audit, and reconcile. All of the new payment services were built around the idea of data immutability. For payment orchestration, data immutability manifests in two major forms: persistent events and versioning. Events are naturally append-only. It is the responsibility of the event consumer to determine if a new event represents a modification to an existing event. When an existing product is altered (e.g., adding another night to a stay), the modifications to the payment orchestration plan are modeled as a new version in a sequence of plans for that product. The combined information from all the versions provides a complete history of the intended and actual money movement related to that product.</p><h1 id="d704" class="ke kf hg cg kg kh ki kj kk kl km kn ko im kp in kq ip kr iq ks is kt it ku kv et">A Phased Migration</h1><p id="acd9" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">Various teams at Airbnb took different approaches to the migration towards a service-oriented architecture (SOA). Many teams chose to migrate functionality in small blocks, replacing the legacy implementation with an equivalent SOA one. 
Generally, with this approach, the existing system would be broken down into discrete, cohesive, functional blocks. Each block could be migrated mostly independently of the others. The behavior of each block would be well defined and the result could be trivially compared across both systems to ensure consistent results.</p><p id="8a42" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">The Airbnb payments organization took a different approach for the migration of the various payments systems. Instead of small functional blocks, the migration for the payments systems was broken down into four major phases: Pricing, Payouts, Bookings, and Data Migration. The Pricing phase remodeled each of the product-specific pricing models into a generic model that could be used across all Airbnb products. The Payouts and Bookings phases fundamentally redesigned the way that money movement is orchestrated at Airbnb to more easily support new products, features, and business needs. The majority of the work related to payment orchestration was contained within these phases. The Data Migration phase migrated existing bookings from the legacy system to SOA, allowing the legacy system to be wound down and deprecated.</p><p id="c2b8" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">Within each phase, the migration was divided into smaller migrations, usually by feature or product. For example, in the Bookings phase, bookings for stays were migrated independently from bookings for experiences. When reasonable, those subphases were further broken down as well. The migration of bookings for stays was subdivided into over 30 milestones based on characteristics of the bookings. The relatively small scope of each milestone allowed engineers and data scientists to thoroughly test and validate each set of migrations. 
Additionally, the relatively independent nature of each milestone allowed many of them to be completed in parallel.</p><h1 id="538b" class="ke kf hg cg kg kh ki kj kk kl km kn ko im kp in kq ip kr iq ks is kt it ku kv et">Maintaining Two Systems</h1><p id="c132" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">The new payment orchestration system introduced a fundamentally redesigned data model based around the concept of a bill. Unlike the legacy model, the new data model is not tied to any specific product, but rather focuses on being sufficiently powerful, extensible, and generic to be useful for existing and future Airbnb products. One important consequence of fundamentally redesigning the payment data model was that it became non-trivial to convert from one data model to another.</p><p id="22cc" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">In general, historical bookings and payouts were not moved from one system to another as part of the initial migration process. Rather, new bookings and payouts would be routed to SOA if they were deemed eligible. Otherwise, they would continue to be routed to the legacy system. Throughout most of the migration process, existing bookings would continue to proceed through their lifecycle in the legacy monolithic system. Only at the tail end of the migration were active bookings transitioned from the legacy system to SOA. As a result, engineering teams needed to maintain two parallel payment orchestration systems throughout virtually the entire migration process.</p><p id="c802" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">Most consumers of payments data don’t actually care whether the data is stored in the legacy or SOA system; they just want the data. 
In order to provide an easy and consistent experience for those client services, a new transformation layer was built to transparently retrieve data from the correct underlying source and to seamlessly convert it into a unified data model that could be consumed by all clients. This layer proved incredibly valuable, as it decoupled the work of the migration teams from the work of the client teams.</p><p id="3528" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">Nothing happens in a vacuum. While the migration was in progress, business needs arose and features had to be added to the payment orchestration system. For each feature, teams had to decide whether the changes should be implemented in only one system or in both. In many cases, this led to twice as much work in order to maintain a consistent user experience across both systems. In other cases, features were simply deferred or redesigned to avoid duplication of effort.</p><p id="b09b" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">Finally, special care had to be taken to ensure that both systems behaved in the way that our guests and hosts expected. Ideally, guests and hosts wouldn’t even notice the difference apart from some improvements in performance. Additional tooling and workflows were created to ensure that Airbnb’s support ambassadors continued to provide a consistent experience for our guests and hosts regardless of which system was used to orchestrate payments.</p><p id="7ed3" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">One key learning from this experience was how critical it is to communicate with all stakeholders to ensure that everyone is aligned on timelines, constraints, and priorities. 
Maintaining two parallel systems over an extended period of time creates a lot of overhead and slows down iteration speeds for new features. It is vital to ensure that the broader organization is aligned on the timeline so that product teams aren’t unnecessarily slowed down by unexpected work related to a partially migrated system. Splitting the migration into phases helps reduce the time during which teams are impacted.</p><h1 id="9182" class="ke kf hg cg kg kh ki kj kk kl km kn ko im kp in kq ip kr iq ks is kt it ku kv et">Commitment to Craft</h1><p id="b114" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">Perhaps the most important part of the migration process was ensuring that the new system was built with Airbnb’s <a class="au jr" rel="noopener" href="https://medium.com/airbnb-engineering/commitment-to-craft-e36d5a8efe2a">Commitment to Craft</a> in mind and thoroughly validated before being rolled out. A dedicated team of quality assurance engineers performed comprehensive manual testing of hundreds of scenarios to help ensure consistency with the legacy system across a wide spectrum of use cases. In addition, an extensive set of unit tests, shallow integration tests, and end-to-end integration tests was created across the entire payments engineering organization to ensure the correct behavior of key payment flows. As an additional safeguard, whenever possible, asynchronous “matchup” jobs would compare the new data model and the old data model to validate that both codepaths produced consistent results.</p><h1 id="5e2f" class="ke kf hg cg kg kh ki kj kk kl km kn ko im kp in kq ip kr iq ks is kt it ku kv et">Conclusion</h1><p id="2aa6" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">Payments systems are complex. 
Taking the time to thoughtfully redesign the system can lead to improvements in maintainability, extensibility, performance, and resiliency. However, there are also noteworthy disadvantages to a long-lived migration process. The process can lead to uncertainty among clients of the service and consume resources that might otherwise be spent creating new features or optimizing existing flows. It is possible to mitigate some of these concerns by dividing the migration into smaller, well-defined milestones and ensuring regular communication with stakeholders. A thorough testing and validation plan is vital for ensuring that the new service can seamlessly replace legacy systems. By following this approach, we were able to launch a new payment orchestration system that is faster, easier to maintain, and can more easily support new products, features, and business needs.</p><p id="9011" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">Watch the recording of the <a class="au jr" href="https://www.facebook.com/AirbnbTech/videos/airbnb-tech-talk-make-money-moves/349403395953262/" rel="noopener ugc nofollow" target="_blank">Make Money Moves tech talk</a> for a more in-depth discussion of the migration of payments services to SOA.</p></div><div class="o ct lw lx ly lz" role="separator"><div class="gz ha hb hc hd"><p id="92d9" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">If this type of work interests you, check out some of our related positions:</p><p id="00d9" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><a class="au jr" href="https://careers.airbnb.com/positions/3393082/?gh_src=5a0351831us" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Payments</a> (San Francisco or Seattle)</p><p id="3966" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc 
ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><a class="au jr" href="https://careers.airbnb.com/positions/3393185/?gh_src=3eaf43fe1us" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Payments</a> (San Francisco or Seattle)</p><p id="fcc8" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><a class="au jr" href="https://careers.airbnb.com/positions/2768475/" rel="noopener ugc nofollow" target="_blank">Manager, Engineering Payments Compliance</a> (Bangalore, India)</p><p id="b6ab" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><a class="au jr" href="https://careers.airbnb.com/positions/2925359/" rel="noopener ugc nofollow" target="_blank">Senior Software Engineer, Payments Compliance</a> (Bangalore, India)</p><p id="dfba" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><a class="au jr" href="https://careers.airbnb.com/positions/2773515/" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, Payments Compliance</a> (Bangalore, India)</p><p id="8615" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et"><a class="au jr" href="https://careers.airbnb.com/positions/3197040/" rel="noopener ugc nofollow" target="_blank">Software Engineer </a>— Cities (Bangalore, India)</p></div><div class="o ct lw lx ly lz" role="separator"><div class="gz ha hb hc hd"><h1 id="440f" class="ke kf hg cg kg kh me kj kk kl mf kn ko im mg in kq ip mh iq ks is mi it ku kv et">Acknowledgments</h1><p id="1db9" class="pw-post-body-paragraph iv iw hg ix b iy kw ih ja jb kx ik jd je ky jg jh ji kz jk jl jm la jo jp jq gz et">This migration has been a long journey that wouldn’t have been possible without the contributions of many people at Airbnb across several organizations including Payments, Hosting, Guest 
Experience, QA, and Finance. Too many people helped on this project to thank all of them here, but the authors would like to recognize Musaab At-Taras, Xuemei Bao, Ryan Bi, Abhijit Borude, Ben Bowler, Yizheng Cai, Jiaqi Chen, Haoran Cheng, Cynthia Adams, Pat Connors, David Cordoba, Chong Chung, Anqi Dai, Xinyue Deng, Rex Du, Ali Goksel, Ömer Faruk Gül, Jiajia Han, Jing Hao, Toland Hon, Jeremy Kane, Hide Kato, Fanchen Kong, Victoria Ku, Pasha Lahutski, Serena Li, Tina Li, Harry Liu, Michael Liu, Wenguo Liu, Yixia Mao, Elena Moskvichev, Eric Ning, Ika Ogeil, Christina Ou, Payut Pantawongdecha, Yixiao Peng, Yaritza Perez, Wentao Qi, Zachary Sabin, Rajen Shah, Patrick Shay, Bo Shi, Derek Shimozawa, Erika Stott, Huayan Sun, Sam Tang, Claire Thompson, Neo Tong, Alex Virrueta, Jing Wang, Bryan Wehner, Michel Weksler, Claudio Wilson, Xuanxuan Wu, Liang Xiao, Chao Xin, Serdar Yildirim, Hang Yuan, Brian Zhang, Yunfei Zhao, Jaclyn Zhong, and Linglong Zhu for their contributions over the lifetime of this project.</p></div><div class="o ct lw lx ly lz" role="separator"><div class="gz ha hb hc hd"><p id="ba29" class="pw-post-body-paragraph iv iw hg ix b iy iz ih ja jb jc ik jd je jf jg jh ji jj jk jl jm jn jo jp jq gz et">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</p></div></div></div></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/rebuilding-payment-orchestration-at-airbnb-341d194a781b</link>
      <guid>https://medium.com/airbnb-engineering/rebuilding-payment-orchestration-at-airbnb-341d194a781b</guid>
      <pubDate>Thu, 24 Feb 2022 20:47:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[My Journey to Airbnb — Lucius DiPhillips]]></title>
      <description><![CDATA[<header class="pw-post-byline-header fj fk fl fm fn fo fp fq fr fs l"><div class="o u"><div class="o"><div class="eb l"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----79d1f0bc72a2-----------------------------------"><div class="l cj"><img alt="AirbnbEng" class="l dl ed ft fu" src="https://miro.medium.com/fit/c/96/96/1*PrgppbVAePgtuFs2XZa8Ig.jpeg" width="48" height="48" /></div></a></div><div class="l"><div class="fv o fw"><div><div class="ck" role="tooltip" aria-hidden="false"><p class="pw-author cg b ch ci et"><a class="au av aw ax ay az ba bb bc bd be bf bg bh bi" rel="noopener follow" href="https://medium.com/@airbnbeng?source=post_page-----79d1f0bc72a2-----------------------------------">AirbnbEng</a></p><div class="fx fy fz ga gb d" aria-hidden="true">·</div><div class="fy fz ga gb d"><div class="cg b ch ci ev"><a class="ev gc aw ax ay az ba bb bc bd gd ge bg gf gg" rel="noopener follow" href="https://medium.com/m/signin?actionUrl=https%3A%2F%2Fmedium.com%2F_%2Fsubscribe%2Fuser%2Febe93072cafd&amp;operation=register&amp;redirect=https%3A%2F%2Fmedium.com%2Fairbnb-engineering%2Fmy-journey-to-airbnb-lucius-diphillips-79d1f0bc72a2&amp;user=AirbnbEng&amp;userId=ebe93072cafd&amp;source=post_page-ebe93072cafd----79d1f0bc72a2---------------------follow_sidebar--------------">Follow</a></div></div></div></div></div><div class="o ao gh"><p class="pw-published-date cg b ej ek bw">Feb 17</p><div class="fx ck" aria-hidden="true">·</div><div class="pw-reading-time cg b ej ek bw">6 min read</div></div></div></div><div class="gi gj gk gl gm d"><div class="ck" aria-hidden="false"></div></div></div></header><section><div><div class="gz ha hb hc hd"><div class=""><figure class="fk fm ie if ig ih fg fh paragraph-image"><div role="button" tabindex="0" class="ii ij cj ik dq il"><div class="fg fh id"><img alt="" class="dq im in" 
src="https://miro.medium.com/max/1400/1*hnM9txCDUVpdDN-u9Z9lJQ.jpeg" width="700" height="442" role="presentation" /></div></div></figure><p id="899e" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">Airbnb’s CIO on sponsorship, belonging, and the power of human connection</p><p id="dfc6" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et"><em class="jm">Lucius DiPhillips is the Chief Information Officer (CIO) at Airbnb. He has over 20 years of experience that spans Product Development, Information Technology, Customer Service, Financial Services, Payments, eCommerce, and Trust &amp; Safety. He has a Degree in Management Information Systems from Rensselaer Polytechnic Institute and serves as the executive sponsor for several diversity and belonging groups and initiatives across the company. Through his sponsorship, Lucius has been instrumental in helping to improve the ways in which Airbnb attracts and retains diverse technical talent.</em></p><h1 id="eb2f" class="jn jo hg cg jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk et">Breaking barriers and growing from adversity</h1><p id="81f9" class="pw-post-body-paragraph io ip hg iq b ir kl it iu iv km ix iy iz kn jb jc jd ko jf jg jh kp jj jk jl gz et">I grew up in Upstate New York, in a small town called Hudson Falls. I was raised by a single-parent mom, and I’m an only child. Growing up, we struggled financially, and I helped out wherever I could. From delivering papers as a ten-year-old to waiting tables as a high-schooler, I always had something to balance. 
That’s helped me as a leader to this day: balance and having that hard work ethic, and seeing my mom’s struggle as a single mom.</p><p id="3c34" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">I’m multiracial: my mom is white and my dad has Black heritage. At an early age, I became acutely aware that I was different, and sometimes I didn’t feel like I belonged, because of how I looked, because I didn’t have both parents in the picture, and because we didn’t have a lot financially. Rather than letting my differences hold me back, I immersed myself in many things, from playing sports to the school choir and musicals. I wanted to get to know a lot of different people and ultimately go beyond the superficial labels and barriers between us.</p><p id="8dda" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">That has become part of how I lead to this day, and how I build teams. It’s very much about connecting with people beyond what you might see on the surface, and really trying to find common ground. It’s a skill that came from a dark place early on in my career, and it now helps me be a coach, mentor, and sponsor who invests in others and helps them develop in their own careers.</p><h1 id="2820" class="jn jo hg cg jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk et">A career path shaped by curiosity, connections, and conversations</h1><p id="2571" class="pw-post-body-paragraph io ip hg iq b ir kl it iu iv km ix iy iz kn jb jc jd ko jf jg jh kp jj jk jl gz et">I first got interested in tech in the 90s, as a student curious about this new thing called the internet. While my career started in traditional IT, I later got into product development and software development for end-users. Normally, those are two different profiles, but I’m more of a hybrid with a broad understanding of all facets of tech. 
I love technology, but I also love operations, leadership, and people. I have an appreciation for how we make sure that we connect what’s happening with tech to real customers, real people at the end of the day.</p><p id="59ff" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">Making and maintaining many different personal connections with mentors and colleagues has led to a variety of opportunities in my career, and eventually brought me to Airbnb. I first came to Airbnb leading our Payments technology organization. Airbnb has a structured framework for career conversations that involves assessing your dream job, your story, your strengths, and what you want to do better, and from there identifying career development opportunities. This process led to my current role as the first CIO at Airbnb.</p><p id="c942" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">I feel like I have the best job as a CIO. I feel like I work for the best company in the world at Airbnb. That’s why I’m still glowing and here, going on four years later. And to me, this is just the beginning.</p><h1 id="0495" class="jn jo hg cg jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk et">Diversity and belonging in tech</h1><p id="b15d" class="pw-post-body-paragraph io ip hg iq b ir kl it iu iv km ix iy iz kn jb jc jd ko jf jg jh kp jj jk jl gz et">I am the co-sponsor of the Tech Diversity Council, a group of senior technical leaders at Airbnb tasked with amplifying and advocating for diversity-related projects and initiatives across our Tech org. It’s one of the most important roles that I have to play, if not the most important. And we’ve created the Council because we still have a long way to go in terms of representation across tech and across Airbnb. 
To me, the best way to get involved is to give my time and push ideas and vision into action that creates impact.</p><p id="40e7" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">There isn’t just one initiative to talk about here, but rather many parallel efforts that span from wide-scale to personal. In addition to the Tech Diversity Council, we have a hyperfocus on Black in Tech as a group, and we have the Black Sponsorship Program, developed and led by Airbnb’s Black Employee Resource Group, Black@. I lead a monthly series where anyone in technology who self-identifies as Black can choose to come together in a safe place to share and contribute.</p><p id="07fa" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">At Airbnb, we’ve always prided ourselves on standing for what we believe in unapologetically. With the Black Lives Matter movement and the George Floyd protests, I felt like we had the support to speak up about what we were feeling and the experiences we were having. The Black@ group created <a class="au kq" href="https://news.airbnb.com/activism-and-allyship-guide/" rel="noopener ugc nofollow" target="_blank">a guide</a> on how to be an ally, and we hosted many conversations about what it means to be Black in America. To me, that was the most important thing we could have been doing at that moment in time, and we continue doing it.</p><h1 id="c8ac" class="jn jo hg cg jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk et">Redesigning our hiring process</h1><p id="d503" class="pw-post-body-paragraph io ip hg iq b ir kl it iu iv km ix iy iz kn jb jc jd ko jf jg jh kp jj jk jl gz et">I’m proud of Airbnb, because most companies don’t even talk about where they hope to go. They don’t share representation numbers. 
We not only share them, we say we can do better, and we’re going to do better, and here’s how and when.</p><p id="5261" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">One of the things we’ve done is go back to the drawing board to redesign our hiring process. Having a love of operations, data, and driving process improvements, I approached our hiring process like a product. Step by step, we looked at the data and asked “Is there a disproportionate drop-off here for certain demographics? What can we do differently?”</p><p id="a0e1" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">We realized that engineers are unique, not just in their gender, ethnicity, and work experience, but in how they’re most comfortable interviewing. So we decided to give candidates more flexibility: they might choose either to do a take-home coding test or show us some open source work they’re proud of. We also got more managers involved at the ground level to support our diversity and inclusion candidates and help them feel seen, understood, and connected to their future team. Engaging more of our diverse engineers early on in the hiring process had a huge impact.</p><p id="9ea6" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">Starting from my time with the Payments organization, I recognized the urgency of the moment and the stakes at play for underrepresented candidates and pushed the recruiting team to put changes into action rather than waiting or holding back. 
I call it breaking some glass: you need to break some glass every once in a while to challenge the status quo.</p><h1 id="4107" class="jn jo hg cg jp jq jr js jt ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk et">A human-centric way to lead</h1><p id="bb26" class="pw-post-body-paragraph io ip hg iq b ir kl it iu iv km ix iy iz kn jb jc jd ko jf jg jh kp jj jk jl gz et">If you focus on belonging and engagement, and you make it a priority, then you can create a better environment for your team. When traumatic things happen, it’s important to educate others so they can be allies, and to create a safe space for people to share. As part of our wellness programs, we host “listening sessions.” I’ve hosted them with my leadership team, for Black@, for what was happening with the Afghan refugee situation, or when the COVID-19 pandemic was spiking in India.</p><p id="3822" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">I’m also passionate about dispelling the myth that work-life balance is a real thing you can actually conquer. It’s a pet peeve of mine that we talk about work-life balance. My balance is very different from any one of yours. So let’s talk about flexibility, and having empathy for each other’s unique needs and situations.</p><p id="89a5" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">I care about being transparent and available, and part of the way I do that is by offering an office hours slot twice a day that anyone can sign up for at any time. We actually took that idea and scaled it with something called coffee chats. Anyone in our organization can sign up to have a coffee chat with someone else. You don’t know who it’s going to be until you show up. And that’s what I loved about my office hour slots, being able to ask, “What’s your story? How long have you been here? 
What’s one personal thing you’re thinking about?” At heart I want to connect with people. I want to demonstrate a human-centric way to lead.</p><p id="6546" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">–</p><p id="4415" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et">Interested in working with Lucius at Airbnb? Check out these open roles:</p><p id="96a3" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et"><a class="au kq" href="https://careers.airbnb.com/positions/3897689/" rel="noopener ugc nofollow" target="_blank">Senior Engineering Manager, Tax Platform</a></p><p id="c3ee" class="pw-post-body-paragraph io ip hg iq b ir is it iu iv iw ix iy iz ja jb jc jd je jf jg jh ji jj jk jl gz et"><a class="au kq" href="https://careers.airbnb.com/positions/3873636/" rel="noopener ugc nofollow" target="_blank">Systems Engineer, Client Engineering</a></p></div></div></div></section>]]></description>
      <link>https://medium.com/airbnb-engineering/my-journey-to-airbnb-lucius-diphillips-79d1f0bc72a2</link>
      <guid>https://medium.com/airbnb-engineering/my-journey-to-airbnb-lucius-diphillips-79d1f0bc72a2</guid>
      <pubDate>Thu, 17 Feb 2022 20:54:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[The Past, Present, and Future of react-dates]]></title>
      <description><![CDATA[<div class=""><p id="e9d3" class="hh hi gk hj b hk hl hm hn ho hp hq hr hs ht hu hv hw hx hy hz ia ib ic id ie gd ez"><a class="au if" href="https://www.linkedin.com/in/kodiane/" rel="noopener ugc nofollow" target="_blank">Diane Ko</a></p><figure class="ih ii ij ik il im fo fp paragraph-image"><div role="button" tabindex="0" class="in io cj ip dp iq"><div class="fo fp ig"><img alt="" class="dp ir is" src="https://miro.medium.com/max/1400/1*VymqCVttV2_VOqmmApgakw.jpeg" width="700" height="467" role="presentation" /></div></div></figure><p id="2da4" class="hh hi gk hj b hk hl hm hn ho hp hq hr hs ht hu hv hw hx hy hz ia ib ic id ie gd ez">In 2016, Airbnb released react-dates, a React date picker component library. The <a class="au if" href="https://github.com/airbnb/react-dates/stargazers" rel="noopener ugc nofollow" target="_blank">project has amassed more than 11,000 stars</a>. GitHub also tells us that <a class="au if" href="https://github.com/airbnb/react-dates/network/dependents" rel="noopener ugc nofollow" target="_blank">react-dates is used by over 30,000 repos</a>.</p><p id="ae28" class="hh hi gk hj b hk hl hm hn ho hp hq hr hs ht hu hv hw hx hy hz ia ib ic id ie gd ez">In more recent years, Airbnb’s requirements for a date picker have diverged from react-dates. Had we made the corresponding changes in the library, they would have severely limited its flexibility, one of its key features. To better support the react-dates community, we’ve made the decision to transfer ownership of the react-dates repo to a new <a class="au if" href="https://github.com/react-dates" rel="noopener ugc nofollow" target="_blank">react-dates GitHub organization</a>. 
We believe this new home will better serve the community and continue to evolve the original goals of react-dates.</p><p id="8a7b" class="hh hi gk hj b hk hl hm hn ho hp hq hr hs ht hu hv hw hx hy hz ia ib ic id ie gd ez">If you want to help react-dates grow, please check out the <a class="au if" href="https://github.com/airbnb/react-dates/issues" rel="noopener ugc nofollow" target="_blank">open issues</a> and <a class="au if" href="https://github.com/airbnb/react-dates/pulls" rel="noopener ugc nofollow" target="_blank">pull requests</a> — the <a class="au if" href="https://github.com/airbnb/react-dates/labels/pull%20request%20wanted" rel="noopener ugc nofollow" target="_blank">“pull request wanted” tag</a> is a great starting point.</p></div>]]></description>
      <link>https://medium.com/airbnb-engineering/the-past-present-and-future-of-react-dates-b351ab739d3f</link>
      <guid>https://medium.com/airbnb-engineering/the-past-present-and-future-of-react-dates-b351ab739d3f</guid>
      <pubDate>Fri, 21 Jan 2022 18:45:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Intelligent Automation Platform: Empowering Conversational AI and Beyond at Airbnb]]></title>
      <description><![CDATA[<div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@xuzhihengtoutou?source=post_page-----869c44833ff2-----------------------------------"><img alt="Zhiheng Xu" class="s er es et" src="https://miro.medium.com/fit/c/96/96/1*CPhehvIDylFnFyXw3x5bWQ.jpeg" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@xuzhihengtoutou?source=post_page-----869c44833ff2-----------------------------------">Zhiheng Xu</a><div class="eu n"><div class="fv s"><div><div><div class="ft" role="tooltip" aria-hidden="false"><div class="s"></div></div></div></div></div></div></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/intelligent-automation-platform-empowering-conversational-ai-and-beyond-at-airbnb-869c44833ff2?source=post_page-----869c44833ff2-----------------------------------">Jan 11</a> · 9 min read</div></div></div><div class="n gi gj gk gl gm gn go gp z"><div class="n o"><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div></div></div></div></div></div><p id="d6fb" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">How Intelligent Automation Platform supports conversational AI and agent-automation to improve the Airbnb customer experience</p><p id="d7b3" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">By <a class="bu hs" 
href="https://www.linkedin.com/in/zhiheng-xu-50249b31/" rel="noopener ugc nofollow" target="_blank">Zhiheng Xu</a>, <a class="bu hs" href="https://www.linkedin.com/in/yi-alex-zhou-1284651b/" rel="noopener ugc nofollow" target="_blank">Alex Zhou</a>, <a class="bu hs" href="https://www.linkedin.com/in/chutianwang/" rel="noopener ugc nofollow" target="_blank">Jeremy Wang</a>, <a class="bu hs" href="https://www.linkedin.com/in/zecheng-xu-11bb778a/" rel="noopener ugc nofollow" target="_blank">Zecheng Xu</a>, <a class="bu hs" href="https://www.linkedin.com/in/ziyi-wang-6651b5b1/" rel="noopener ugc nofollow" target="_blank">Ziyi Wang</a>, <a class="bu hs" href="https://www.linkedin.com/in/jiayu-lou-337ba785/" rel="noopener ugc nofollow" target="_blank">Jiayu Lou</a>, <a class="bu hs" href="https://www.linkedin.com/in/liuming-zhang-4b120894/" rel="noopener ugc nofollow" target="_blank">Liuming Zhang</a>, <a class="bu hs" href="https://www.linkedin.com/in/fengjian-pan/" rel="noopener ugc nofollow" target="_blank">Gary Pan</a>, Jeffrey Zhao, Yisong Wang, <a class="bu hs" href="https://www.linkedin.com/in/priyanksinghal/" rel="noopener ugc nofollow" target="_blank">Priyank Singhal</a>, <a class="bu hs" href="https://www.linkedin.com/in/clairexiong/" rel="noopener ugc nofollow" target="_blank">Claire Xiong</a>, <a class="bu hs" rel="noopener" href="https://medium.com/@waynezhang511">Wayne Zhang</a>, <a class="bu hs" href="https://www.linkedin.com/in/benmatata2020/" rel="noopener ugc nofollow" target="_blank">Ben Ma</a>, <a class="bu hs" href="https://www.linkedin.com/in/hao-wang-2661553/" rel="noopener ugc nofollow" target="_blank">Hao Wang</a>, Carter Appleton, Anthony Clifton</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><img alt="" class="aj if ig" src="https://miro.medium.com/max/1400/0*-OP5Y1xe4uxuNzvn" width="700" height="468" role="presentation" /></div></div></figure><p 
id="9ed8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">With the rapid development of Machine Learning and Natural Language Processing technologies, conversational AI has attracted huge attention in recent years. More and more conversational AI applications such as virtual assistants, smart speakers, and customer support chatbots have been developed to help people in their daily lives.</p><p id="d823" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">At Airbnb, we have developed multiple conversational AI products to enhance our host and guest experience. Examples include our <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/using-chatbots-to-provide-faster-covid-19-community-support-567c97c5c1c9">chatbot systems</a>, which support users through in-app messaging or automated phone calls, our <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/task-oriented-conversational-ai-in-airbnb-customer-support-5ebf49169eaa">task-oriented ML framework</a> for issue detection and automatic problem solving, and various on-trip support products that proactively help guests during their trips.</p><p id="4090" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In this blog post, we introduce the <strong class="gw ih"><em class="ii">Intelligent Automation Platform</em></strong> (AP), a generic enterprise-level platform developed by Airbnb to support a suite of conversational AI products. 
From this point forward, the Intelligent Automation Platform is referred to as “AP”.</p><p id="7ced" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">By modeling conversational AI products as <a class="bu hs" href="https://en.wikipedia.org/wiki/Markov_decision_process" rel="noopener ugc nofollow" target="_blank">Markov Decision Process</a> (MDP) workflows, AP provides a unified representation of workflows and actions to facilitate workflow consolidation and action reusability. Additionally, the platform offers a GUI development tool to enable drag-and-drop workflow creation, facilitate fast iteration of products, and empower non-technical teams to build conversational AI products.</p><h1 id="d9d9" class="ij ik do bo il im in gz io ip iq hd ir is it iu iv iw ix iy iz ja jb jc jd je el"><strong class="ca">1. Platform Architecture</strong></h1><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="jn jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*UIcZmMClcOTgHDDW" width="700" height="537" srcset="https://miro.medium.com/max/552/0*UIcZmMClcOTgHDDW 276w, https://miro.medium.com/max/1104/0*UIcZmMClcOTgHDDW 552w, https://miro.medium.com/max/1280/0*UIcZmMClcOTgHDDW 640w, https://miro.medium.com/max/1400/0*UIcZmMClcOTgHDDW 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 1: AP Architecture</figcaption></div></figure><p id="a345" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 1 shows the high-level architecture of AP, which consists of 4 main components:</p><ol class=""><li id="18e9" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jw jx jy el"><strong class="gw 
ih">Event Orchestrator</strong>, the event orchestration layer of the platform. It translates input/output messages between clients and Workflow Engine, to ensure that workflows on AP can be built and executed in a generic way.</li><li id="0758" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Workflow Engine</strong>, the “brain” of the platform. It is responsible for managing and executing all the workflows powered by the platform.</li><li id="6e02" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Action Store</strong>, the action execution engine of the platform. It supports action requests during workflow execution. Action Store is an open platform for developers to create new actions or reuse existing ones. By using actions in the Action Store, we standardize task execution across different systems and backends, and ensure a consistent user experience across different products.</li><li id="67d0" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Flow Builder</strong>, the workflow creation GUI tool of the platform. It’s a collaborative, drag-and-drop interface that simplifies creation and management of workflows. Flow Builder outputs workflows that can be loaded and executed by Workflow Engine.</li></ol><p id="24c0" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 2 shows an example of a demo “Q &amp; A” workflow on AP. 
The demo workflow, configured via Flow Builder, can answer users’ questions from different channels (such as messaging or phone).</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="ke jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*FnBOBUiWhA2gjW8t" width="700" height="486" srcset="https://miro.medium.com/max/552/0*FnBOBUiWhA2gjW8t 276w, https://miro.medium.com/max/1104/0*FnBOBUiWhA2gjW8t 552w, https://miro.medium.com/max/1280/0*FnBOBUiWhA2gjW8t 640w, https://miro.medium.com/max/1400/0*FnBOBUiWhA2gjW8t 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 2: A Demo Q&amp;A Workflow on AP</figcaption></div></figure><p id="d5e1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When the platform receives a request for the “Q &amp; A” workflow, it triggers:</p><ol class=""><li id="2ff7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jw jx jy el">Event Orchestrator to normalize the request and find the corresponding workflow session if it exists (a workflow session is a single instance of the workflow), and then forward the request to Workflow Engine.</li><li id="94fb" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el">Workflow Engine to restore the previous state of the workflow or create a new one from the start node (state), and then execute the workflow: a) Execute the actions defined for the current workflow state. b) Move the workflow to the next state based on the action results or other conditions. 
c) Pause the workflow and wait for the next input if needed.</li><li id="c7ff" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el">Action Store to execute all the actions required by Workflow Engine.</li></ol><h1 id="1a5a" class="ij ik do bo il im in gz io ip iq hd ir is it iu iv iw ix iy iz ja jb jc jd je el">2. Key Components of Intelligent Automation Platform</h1><h2 id="ec35" class="kf ik do bo il kg kh ki io kj kk kl ir km kn ko iv kp kq kr iz ks kt ku jd kv el">2.1 Event Orchestrator</h2><p id="97f6" class="gu gv do gw b gx kw gz ha hb kx hd he hf ky hh hi hj kz hl hm hn la hp hq hr dg el">One of the design principles of AP is to provide channel-agnostic problem solving capabilities (channels represent the source of requests, such as in-app chatbot or phone). Workflows and Actions are intended to be channel-agnostic, focusing on the core of the problem regardless of the channel users choose to contact us through.</p><p id="099a" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw ih"><em class="ii">Event Orchestrator</em></strong> is the event orchestration layer of AP. It normalizes the input and output of the platform to ensure that conversational workflows can be built and executed in a channel-agnostic way. 
Figure 3 provides the architecture of Event Orchestrator, which contains 3 layers: orchestration layer, context data layer, and workflow request layer.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="lb jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*B_Xp_EsngxICFpgh" width="700" height="459" srcset="https://miro.medium.com/max/552/0*B_Xp_EsngxICFpgh 276w, https://miro.medium.com/max/1104/0*B_Xp_EsngxICFpgh 552w, https://miro.medium.com/max/1280/0*B_Xp_EsngxICFpgh 640w, https://miro.medium.com/max/1400/0*B_Xp_EsngxICFpgh 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 3: Event Orchestrator Architecture</figcaption></div></figure><p id="8ffc" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The orchestration layer handles all the requests and responses. It currently supports 3 types of input:</p><ol class=""><li id="fe53" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jw jx jy el"><strong class="gw ih">Channel message</strong>. These are messages delivered from different channels, such as phone, email or in-app messaging.</li><li id="de6b" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Async events</strong>. These are async events (such as <a class="bu hs" href="https://kafka.apache.org/intro" rel="noopener ugc nofollow" target="_blank">Kafka</a> events) generated by different Airbnb internal systems, like cancellation events.</li><li id="c347" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Internal service requests</strong>. 
Event Orchestrator also provides a few endpoints to handle workflow requests from other Airbnb internal services directly.</li></ol><p id="5846" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The context data layer stores all contextual information related to the platform requests. Before creating a workflow request to the <em class="ii">Workflow Engine</em>, the context data layer: a) Identifies whether the request belongs to a new workflow session or an existing one, by looking up the session mapping tables. b) Restores critical contextual information for workflow execution by reading from session data tables.</p><p id="b440" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The workflow request layer prepares the request to <em class="ii">Workflow Engine</em> for workflow execution and processes the response from <em class="ii">Workflow Engine</em>. It makes sure that platform requests from different sources are converted into the same Workflow Engine requests so that <em class="ii">Workflow Engine</em> can handle all workflows in a generic way.</p><h2 id="6bf7" class="kf ik do bo il kg kh ki io kj kk kl ir km kn ko iv kp kq kr iz ks kt ku jd kv el">2.2 Workflow Engine</h2><p id="d5ef" class="gu gv do gw b gx kw gz ha hb kx hd he hf ky hh hi hj kz hl hm hn la hp hq hr dg el"><strong class="gw ih"><em class="ii">Workflow Engine</em></strong> is the “brain” of AP, responsible for executing and monitoring all the workflows powered by the platform.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="lc jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*KCemaFJ_3_SHp3s5" width="700" height="642" srcset="https://miro.medium.com/max/552/0*KCemaFJ_3_SHp3s5 276w, 
https://miro.medium.com/max/1104/0*KCemaFJ_3_SHp3s5 552w, https://miro.medium.com/max/1280/0*KCemaFJ_3_SHp3s5 640w, https://miro.medium.com/max/1400/0*KCemaFJ_3_SHp3s5 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 4: Workflow Engine Architecture</figcaption></div></figure><p id="508c" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 4 shows the overall architecture of Workflow Engine, which consists of 4 main components:</p><ol class=""><li id="54b1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jw jx jy el"><strong class="gw ih">Session Manager</strong>. Session Manager manages the entire lifecycle of workflow execution. After receiving a workflow execution request, it will restore the previous state of the workflow (if a workflow is resumed) or create a new workflow from the start state (if a new workflow is created). When a workflow needs to pause and wait for a user response, Session Manager will store the current state and all workflow variables in the database, to be restored by the next request of the same session.</li><li id="ff28" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Schema Loader</strong>. Schema Loader loads the workflow schema generated by <em class="ii">Flow Builder</em>, the workflow creation UI tool of AP. A workflow schema is a JSON schema file automatically generated by <em class="ii">Flow Builder </em>(see more details in the <em class="ii">Flow Builder</em> section).</li><li id="b4f9" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Workflow Executor</strong>. Workflow Executor executes the workflow based on the workflow schema, starting from the current state of the workflow. 
It processes the action defined in the current state by sending a request to the <em class="ii">Action Store</em>, handles the response, and saves the variables to the Variable Manager. After that, it moves the workflow to the next state according to the transition conditions and starts processing the next workflow state. The Workflow Executor repeats this process until the workflow needs to pause (to wait for a user response) or reaches its end.</li><li id="389b" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Variable Manager</strong>. Variables are the data supporting workflow execution. Variable Manager manages all the variables, and Workflow Executor uses it to read and update variables during workflow execution.</li></ol><h2 id="bf0b" class="kf ik do bo il kg kh ki io kj kk kl ir km kn ko iv kp kq kr iz ks kt ku jd kv el">2.3 Action Store</h2><p id="2f16" class="gu gv do gw b gx kw gz ha hb kx hd he hf ky hh hi hj kz hl hm hn la hp hq hr dg el"><strong class="gw ih"><em class="ii">Action Store</em></strong> is the action execution engine of AP, supporting action execution requests from <em class="ii">Workflow Engine</em>. It is also an open platform for developers to create new actions or reuse existing ones. All actions in the Action Store are available in <em class="ii">Flow Builder</em> for creating workflows.</p><p id="0780" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">As shown in Figure 5, all actions in the Action Store implement a common interface, so that they can be processed in the same way during action execution (by <em class="ii">Workflow Engine</em>) and workflow creation (by <em class="ii">Flow Builder</em>). 
An action can be as simple as fetching a user’s reservation data or as complicated as issue prediction, which might involve multiple machine learning models and feature generation pipelines.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="ld jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*mGHSc3l3YH9cd9GJ" width="700" height="263" srcset="https://miro.medium.com/max/552/0*mGHSc3l3YH9cd9GJ 276w, https://miro.medium.com/max/1104/0*mGHSc3l3YH9cd9GJ 552w, https://miro.medium.com/max/1280/0*mGHSc3l3YH9cd9GJ 640w, https://miro.medium.com/max/1400/0*mGHSc3l3YH9cd9GJ 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 5: Action Interface</figcaption></div></figure><p id="b9a8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 6 shows the high level Architecture of Action Store, which contains 3 main components:</p><ol class=""><li id="fd91" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jw jx jy el"><strong class="gw ih">Action Executor</strong>. Action Executor supports action execution requests. When receiving a request, Action Executor will load the action implementation from Action Manager based on the action type and invoke the execution function defined in the implementation. Many actions rely on external services to finish the execution, and the Action Executor will be responsible for sending those external requests and processing the response.</li><li id="53f3" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">ActionInfo Handler</strong>. 
ActionInfo Handler supports <em class="ii">Flow Builder</em> for workflow creation by serializing all the action information (e.g., metadata, payload, results, etc.) to <em class="ii">Flow Builder</em> to render the actions on the UI and support action configuration when creating workflows. More details are available in the <em class="ii">Flow Builder</em> section.</li><li id="2317" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el"><strong class="gw ih">Action Manager</strong>. Action Manager registers and manages all the actions created in the Action Store. It provides action implementation to Action Executor and ActionInfo Handler based on the action type.</li></ol><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="le jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*0upy1aVoDkIG8tVJ" width="700" height="330" srcset="https://miro.medium.com/max/552/0*0upy1aVoDkIG8tVJ 276w, https://miro.medium.com/max/1104/0*0upy1aVoDkIG8tVJ 552w, https://miro.medium.com/max/1280/0*0upy1aVoDkIG8tVJ 640w, https://miro.medium.com/max/1400/0*0upy1aVoDkIG8tVJ 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 6: Action Store Architecture</figcaption></div></figure><h2 id="f7b3" class="kf ik do bo il kg kh ki io kj kk kl ir km kn ko iv kp kq kr iz ks kt ku jd kv el">2.4 Flow Builder</h2><p id="af4c" class="gu gv do gw b gx kw gz ha hb kx hd he hf ky hh hi hj kz hl hm hn la hp hq hr dg el"><strong class="gw ih"><em class="ii">Flow Builder</em></strong> is the workflow creation UI tool of AP, supporting drag-and-drop workflow creation. 
It integrates with <em class="ii">Action Store</em> to retrieve all action information and sends the generated workflow schema to <em class="ii">Workflow Engine</em> during workflow execution.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="lf jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*I0WVsWNOe_UV_C1f" width="700" height="355" srcset="https://miro.medium.com/max/552/0*I0WVsWNOe_UV_C1f 276w, https://miro.medium.com/max/1104/0*I0WVsWNOe_UV_C1f 552w, https://miro.medium.com/max/1280/0*I0WVsWNOe_UV_C1f 640w, https://miro.medium.com/max/1400/0*I0WVsWNOe_UV_C1f 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 7: Flow Builder UI (Action Configuration)</figcaption></div></figure><p id="8ee7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 7 illustrates the UI of Flow Builder when configuring actions in workflow. On the left is the Action Panel, which lists all available actions in the <em class="ii">Action Store</em> and supports searching by action name or description. 
Workflow creators can drag and drop any actions in the workflow panel and then configure the action payload by clicking the action node.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="lg jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*9_Fdqmq1D3vyGtLA" width="700" height="303" srcset="https://miro.medium.com/max/552/0*9_Fdqmq1D3vyGtLA 276w, https://miro.medium.com/max/1104/0*9_Fdqmq1D3vyGtLA 552w, https://miro.medium.com/max/1280/0*9_Fdqmq1D3vyGtLA 640w, https://miro.medium.com/max/1400/0*9_Fdqmq1D3vyGtLA 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 8: Flow Builder UI (Configure the Workflow Graph)</figcaption></div></figure><p id="9ffd" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 8 shows the UI when configuring the workflow graph. Workflow creators can create transitions between workflow nodes (each node can be viewed as a step or state of the workflow) by creating links between nodes and configuring the transition conditions. 
After all the workflow nodes and links are configured, the workflow is ready to be tested and published.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jf"><div class="jl s ic jm"><div class="lh jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/0*5cY6rYs24KtGgbda" width="700" height="329" srcset="https://miro.medium.com/max/552/0*5cY6rYs24KtGgbda 276w, https://miro.medium.com/max/1104/0*5cY6rYs24KtGgbda 552w, https://miro.medium.com/max/1280/0*5cY6rYs24KtGgbda 640w, https://miro.medium.com/max/1400/0*5cY6rYs24KtGgbda 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 9: Flow Builder Architecture</figcaption></div></figure><p id="a637" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 9 is the high level architecture of Flow Builder. It contains two major components:</p><ol class=""><li id="1523" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jw jx jy el">The frontend layer, which is built with a third-party library <a class="bu hs" href="https://github.com/projectstorm/react-diagrams" rel="noopener ugc nofollow" target="_blank">React-diagrams</a>, supports the UI and all operations on the UI.</li><li id="cfa0" class="gu gv do gw b gx jz gz ha hb ka hd he hf kb hh hi hj kc hl hm hn kd hp hq hr jw jx jy el">The backend layer, Workflow Management service, which is responsible for: a) Getting all action information from the <em class="ii">Action Store</em> and passing to the frontend layer. b) Generating workflow schema that can be executed by <em class="ii">Workflow Engine</em> from the configured workflow graph on the UI. 
c) Serving the workflow schema to <em class="ii">Workflow Engine</em> during workflow execution.</li></ol><p id="fea7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Figure 10 gives an example of an auto-generated workflow schema that can be executed by <em class="ii">Workflow Engine</em>.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct li"><div class="jl s ic jm"><div class="lj jo s"><div class="jg jh t u v ji aj at jj jk"><div class="ftr-noscript"><img alt="" class="t u v ji aj" src="https://miro.medium.com/max/1400/1*h451jjvwaI49jZTduC8Nwg.png" width="700" height="522" srcset="https://miro.medium.com/max/552/1*h451jjvwaI49jZTduC8Nwg.png 276w, https://miro.medium.com/max/1104/1*h451jjvwaI49jZTduC8Nwg.png 552w, https://miro.medium.com/max/1280/1*h451jjvwaI49jZTduC8Nwg.png 640w, https://miro.medium.com/max/1400/1*h451jjvwaI49jZTduC8Nwg.png 700w" role="presentation" /></div></div></div></div></div><figcaption class="js jt cu cs ct ju jv bo b ev bq br">Figure 10: Example of Auto Generated Workflow Schema</figcaption></div></figure><h1 id="c729" class="ij ik do bo il im in gz io ip iq hd ir is it iu iv iw ix iy iz ja jb jc jd je el">3. Conclusion</h1><p id="27e7" class="gu gv do gw b gx kw gz ha hb kx hd he hf ky hh hi hj kz hl hm hn la hp hq hr dg el">In this post, we introduced our Intelligent Automation Platform, a generic and business friendly enterprise platform to support a suite of conversational AI products at Airbnb including chatbots for customers, on-trip support products, and agent automations. 
With Intelligent Automation Platform, we can simplify and speed up conversational AI product development, democratize AI technology for business teams, and scale up more and more intelligent solutions to improve the Airbnb customer experience.</p><h1 id="afe7" class="ij ik do bo il im in gz io ip iq hd ir is it iu iv iw ix iy iz ja jb jc jd je el">Acknowledgements</h1><p id="ae0f" class="gu gv do gw b gx kw gz ha hb kx hd he hf ky hh hi hj kz hl hm hn la hp hq hr dg el">Thanks to Danny Deng, Xirui Liu, Zixuan Yang, Xiang Lan, Keyao Yang, Changhui Liu, Wenbin Zhang, Hengyu Zhou, Stephanie Pang, Jack Chen, Bart Bu, Carter Appleton, Shahaf Abileah, Mariel Young, Shuo Zhang, Wei Ji, Jiayu Liu, Kevin Jungmeisteris, Pratik Shah, Xiaoyu Meng, Michael Zhou, Haoran Zhu, Jon Sandness and Conor D’Arcy for the product collaborations.</p><p id="dee6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Thanks to Tina Su, Andy Yasutake, Joy Zhang, Raj Rajagopal, Navjot Sidhu, James Eby and Julian Warszawski for their leadership support of the Intelligent Automation Platform.</p><p id="78c5" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="ii">Interested in working at Airbnb? Check out this role:</em> <a class="bu hs" href="https://grnh.se/7de3db391us" rel="noopener ugc nofollow" target="_blank">Staff Software Engineer, CSP — Contact Solutions</a></p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/intelligent-automation-platform-empowering-conversational-ai-and-beyond-at-airbnb-869c44833ff2</link>
      <guid>https://medium.com/airbnb-engineering/intelligent-automation-platform-empowering-conversational-ai-and-beyond-at-airbnb-869c44833ff2</guid>
      <pubDate>Tue, 11 Jan 2022 19:10:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb’s Page Performance Score on Android]]></title>
      <description><![CDATA[<div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@lupinglin?source=post_page-----f9fd5e733e-----------------------------------"><img alt="Luping Lin" class="s er es et" src="https://miro.medium.com/fit/c/96/96/0*oI_OkIEpE7Ob0IBd.jpg" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@lupinglin?source=post_page-----f9fd5e733e-----------------------------------">Luping Lin</a><div class="eu n"><div class="fv s"><div><div><div class="ft" role="tooltip" aria-hidden="false"><div class="s"></div></div></div></div></div></div></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-android-f9fd5e733e?source=post_page-----f9fd5e733e-----------------------------------">Dec 17</a> · 7 min read</div></div></div><div class="n gi gj gk gl gm gn go gp z"><div class="n o"><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div></div></div></div></div></div><p id="461d" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="hs">Part 4 of our series on </em><a class="bu ht" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936"><em class="hs">Airbnb’s Page Performance Score</em></a>.</p><p id="6b55" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a class="bu ht" 
href="https://www.linkedin.com/in/lupinglin/" rel="noopener ugc nofollow" target="_blank">Luping Lin</a></p><figure class="hv hw hx hy hz ia cs ct paragraph-image"><div role="button" tabindex="0" class="ib ic id ie aj if"><div class="cs ct hu"><img alt="" class="aj ig ih" src="https://miro.medium.com/max/1400/0*jv0-M5bsGxi2bcXb" width="700" height="467" role="presentation" /></div></div></figure><p id="aa3f" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Airbnb’s home grown <a class="bu ht" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Page Performance Score</a> (PPS) is designed to capture the rich, complex realities of performance by collecting a multitude of user-centric performance metrics and formulating them into one single 0-100 score. In this post we will deep dive into how we define and implement these metrics on Android. Make sure you read the <a class="bu ht" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">overview blog post</a> first to familiarize yourself with our PPS metrics and formula.</p><h1 id="2f91" class="ii ij do bo ik il im gz in io ip hd iq ir is it iu iv iw ix iy iz ja jb jc jd el">Instrumentation</h1><h2 id="55ff" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Universal Page Tracking System</h2><p id="d50c" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">The entire customer journey on Airbnb is divided into different pages, each of which has its own measured PPS. 
In order to support this page-based performance tracking system, we built a standardized infrastructure that enables engineers to configure pages representing their features.</p><p id="2468" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">On Android, a page is associated with a <em class="hs">Fragment</em>. Each fragment must provide a <em class="hs">LoggingConfig</em> object specifying a page name, which can later be retrieved whenever the page name needs to be referenced. We collect performance data throughout the fragment’s lifecycle, and only emit the logging event when the fragment is paused.</p><p id="f5fd" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">A universal <em class="hs">PageName</em> enum is used to uniquely identify each page, and is referenced across all platforms to consistently represent each page in our user journey.</p><figure class="hv hw hx hy hz ia"><div class="ka s id"></div></figure><h2 id="298d" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Capturing Wait Time Perceived by Users</h2><p id="54e4" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">A key differentiator of our new Page Performance Score (PPS) is that it measures wait time that users can see. In contrast, our early measurement effort (mentioned in our <a class="bu ht" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">overview blog post</a>), which was based on the commonly known <a class="bu ht" href="https://web.dev/interactive/" rel="noopener ugc nofollow" target="_blank">Time To Interactive</a> (TTI) metric, measured code execution time and the length of asynchronous calls. 
For example, PPS measures how long a user sees the loading indicators on screen, while TTI measures how long it takes for a network request to return results and how long it takes to build the view models. We believe PPS more closely reflects performance experienced by our users.</p><p id="8cef" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In order to capture visually perceived wait time, we needed all views with a loading state to implement an API that reports their loading state changes. We created a simple interface called <em class="hs">LoadableView</em>.</p><figure class="hv hw hx hy hz ia"><div class="ka s id"></div></figure><p id="f873" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">We provide primitives such as a base <em class="hs">ViewGroup</em>, a base <em class="hs">TextView</em>, and a base <em class="hs">ImageView,</em> all of which implement the <em class="hs">LoadableView</em> interface. Our developers simply need to inherit from these primitives for their views to be automatically instrumented.</p><p id="a4b1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">One challenge was that we needed to keep track of a view’s visibility because if a view is not at least 10% visible on the screen we don’t want to include its loading time in our measurement. The computation of the percentage of visibility of every view is both frequent and recursive. Furthermore, most of our views are in a <em class="hs">RecyclerView</em> and we must ensure their visibility is updated correctly on each scroll event, while keeping the <em class="hs">RecyclerView</em> performant. 
We devised algorithms to reduce the frequency and complexity of these calculations, including caching the visibility states within the <em class="hs">RecyclerView</em>.</p><h1 id="602b" class="ii ij do bo ik il im gz in io ip hd iq ir is it iu iv iw ix iy iz ja jb jc jd el">Metric Implementation</h1><h2 id="e640" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Time to First Layout (TTFL)</h2><p id="bfe8" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">TTFL measures how long a user has to wait before seeing <em class="hs">any</em> content on the screen. TTFL starts at fragment initialization and ends at the first <em class="hs">onGlobalLayout </em>event after the fragment is laid out, at which point the system has finished inflating, measuring, and laying out the fragment’s view hierarchy.</p><p id="aa1c" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">A slow TTFL often indicates that the fragment’s view hierarchy is overly complicated, or the UI thread is preoccupied with unnecessary tasks during fragment initialization.</p><h2 id="f0f6" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Time to Initial Load (TTIL)</h2><p id="2754" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">TTIL measures how long a user sees loading indicators (excluding media loading which is measured separately) before meaningful content is displayed on screen. TTIL starts at fragment initialization like TTFL, and ends when no more views on screen are in a loading state. If a screen (Fragment) is static or cached we don’t show a loading indicator. 
In that scenario, TTIL is the same as TTFL.</p><p id="f148" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">A slow TTIL often reveals opportunities to improve network latency or client rendering time. For network latency, we look for slow backend services, large payloads, unused caches, or a poorly optimized data parser. For rendering time, we try to follow best practices for using RecyclerView, avoid heavy or recursive computation when building view models, and reduce overdraw.</p><p id="5e59" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">As mentioned above, views with a loading state can inherit from base primitives with built-in <em class="hs">LoadableView</em> implementations. The API automatically reports the view’s loading state changes to our logging framework. We use a simple counter that increments when a view enters the loading state and decrements when its data is loaded. 
When the counter reaches 0, we know that there are no more loading views on screen.</p><figure class="hv hw hx hy hz ia"><div class="ka s id"></div></figure><p id="388e" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="hs">This GIF demonstrates TTFL (marked when the gray background with the Airbnb logo is shown) and TTIL (marked when the loading dots are replaced by meaningful content).</em></p><figure class="hv hw hx hy hz ia cs ct paragraph-image"><div class="cs ct kd"><div class="ka s id kj"><div class="kk kc s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="" class="t u v kg aj" src="https://miro.medium.com/max/1200/0*53ZcfBamEiTotrgi" width="600" height="1066" srcset="https://miro.medium.com/max/552/0*53ZcfBamEiTotrgi 276w, https://miro.medium.com/max/1104/0*53ZcfBamEiTotrgi 552w, https://miro.medium.com/max/1200/0*53ZcfBamEiTotrgi 600w" role="presentation" /></div></div></div></div></div></figure><h2 id="3d65" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Main Thread Hangs (MTH)</h2><p id="9540" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">Users experience screen freezes, lags, and stutters when UI frames take too long to render. Each Android device has a target frame refresh rate based on the device’s capability. However, when the main thread is too busy, the device renders more slowly than the frame rate it is capable of. We define an MTH as any frame that takes more than twice the frame interval implied by the system’s refresh rate to render.</p><p id="e705" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Frequent MTHs indicate that the main thread might be overloaded. 
Heavy operations or computations should be moved off the UI thread or delayed until content is rendered.</p><p id="fd35" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">MTH is calculated using <a class="bu ht" href="https://developer.android.com/reference/android/view/FrameMetrics" rel="noopener ugc nofollow" target="_blank">FrameMetrics</a> reported by the Android system. We obtain the frame refresh rate from the system and use it to calculate the hang threshold. We then listen for system callbacks to receive <a class="bu ht" href="https://developer.android.com/reference/android/view/FrameMetrics" rel="noopener ugc nofollow" target="_blank">FrameMetrics</a>. If a frame’s duration is above our threshold, we record the delta <em class="hs">(frameDuration - hangThreshold)</em> as a hang.</p><h2 id="51b7" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Additional Load Time (ALT)</h2><p id="8b56" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">ALT measures any wait time that occurs after the initial load, such as waiting for list pagination or for content to be updated after a Save button is pressed. ALT starts whenever a view enters the loading state <em class="hs">after</em> TTIL has already been marked, and ends when no more loading views are shown. ALT can start and end multiple times; each occurrence is recorded as a separate ALT.</p><p id="0f45" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Opportunities to improve ALT often lie in predicting and prefetching additional content. 
The overall PPS can also be improved by balancing how much content to load in initial load vs additional loads.</p><p id="40ba" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="hs">This GIF demonstrates ALT (marked when the loading indicator at the bottom is replaced by paginated content loaded from the network).</em></p><figure class="hv hw hx hy hz ia cs ct paragraph-image"><div class="cs ct kd"><div class="ka s id kj"><div class="kk kc s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="" class="t u v kg aj" src="https://miro.medium.com/max/1200/0*fptmsQJ6LfgBRQdS" width="600" height="1066" srcset="https://miro.medium.com/max/552/0*fptmsQJ6LfgBRQdS 276w, https://miro.medium.com/max/1104/0*fptmsQJ6LfgBRQdS 552w, https://miro.medium.com/max/1200/0*fptmsQJ6LfgBRQdS 600w" role="presentation" /></div></div></div></div></div></figure><h2 id="c63d" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Rich Content Load Time (RCLT)</h2><p id="df6c" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">RCLT measures how long a user sees a placeholder or a loading indicator until an image, a video, or some rich media content is fully displayed. 
<em class="hs">ImageView</em> and other rich media containers implement the same <em class="hs">LoadableView</em> API to report loading state changes to the PPS logger.</p><p id="0ae5" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">To improve RCLT, we look to reduce image size, improve image caching, optimize image formats and serving, strategically schedule the loading of rich content that is not yet on screen, and select performant streaming libraries.</p><p id="b042" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="hs">This GIF demonstrates RCLT (marked when the placeholders are replaced with actual images loaded from the network).</em></p><figure class="hv hw hx hy hz ia cs ct paragraph-image"><div class="cs ct kd"><div class="ka s id kj"><div class="kk kc s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="" class="t u v kg aj" src="https://miro.medium.com/max/1200/0*BtqfXhapm7jDuKL9" width="600" height="1066" srcset="https://miro.medium.com/max/552/0*BtqfXhapm7jDuKL9 276w, https://miro.medium.com/max/1104/0*BtqfXhapm7jDuKL9 552w, https://miro.medium.com/max/1200/0*BtqfXhapm7jDuKL9 600w" role="presentation" /></div></div></div></div></div></figure><h1 id="540d" class="ii ij do bo ik il im gz in io ip hd iq ir is it iu iv iw ix iy iz ja jb jc jd el">Conclusion</h1><p id="22b0" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">We successfully built an instrumentation framework on Android to capture richer, more user-centric performance metrics, guided by the same design principles behind <a class="bu ht" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936"><em class="hs">Airbnb’s Page Performance Score</em></a> across web and native platforms. 
On top of this framework and the data collected, we built out dashboards to monitor performance across the entire app, set up automatic alerts targeting page owners, streamlined performance goal setting at team and org levels, and systematically tracked and mitigated performance regressions.</p><p id="635b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In 2022, we plan to improve the granularity and accuracy of our instrumentation, for example by measuring tap responsiveness, better differentiating performance during scrolling, and providing primitives with built-in performance optimizations. We will also devote resources to building tooling that improves debuggability and enables early regression detection and prevention via synthetic testing.</p><p id="7552" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">PPS gives our engineers and data scientists better insights and more ways to improve our products. It also strengthens our <a class="bu ht" rel="noopener" href="https://medium.com/airbnb-engineering/commitment-to-craft-e36d5a8efe2a">Commitment to Craft</a> culture. 
We hope that you apply these learnings in your organization as well.</p><h2 id="96cc" class="je ij do bo ik jf jg jh in ji jj jk iq jl jm jn iu jo jp jq iy jr js jt jc ju el">Appreciations</h2><p id="d851" class="gu gv do gw b gx jv gz ha hb jw hd he hf jx hh hi hj jy hl hm hn jz hp hq hr dg el">Thank you to everyone who has helped build PPS on Android: <a class="bu ht" href="https://www.linkedin.com/in/eli-hart-54a4b975/" rel="noopener ugc nofollow" target="_blank">Eli Hart</a>, <a class="bu ht" href="https://www.linkedin.com/in/charlesx2013/" rel="noopener ugc nofollow" target="_blank">Charles Xue</a>, <a class="bu ht" href="https://www.linkedin.com/in/nickbryanmiller/" rel="noopener ugc nofollow" target="_blank">Nick Miller</a>, <a class="bu ht" href="https://www.linkedin.com/in/scheuermann/" rel="noopener ugc nofollow" target="_blank">Andrew Scheuermann</a>, <a class="bu ht" href="https://www.linkedin.com/in/hdezninirola/" rel="noopener ugc nofollow" target="_blank">Antonio Niñirola</a>, <a class="bu ht" href="https://www.linkedin.com/search/results/all/?keywords=joshua%20nelson%20%E2%9C%A8&amp;origin=RICH_QUERY_SUGGESTION&amp;position=0&amp;searchId=959d4aca-c80e-448a-b415-4a732ba7a84d&amp;sid=Rr6" rel="noopener ugc nofollow" target="_blank">Josh Nelson</a>, <a class="bu ht" href="https://www.linkedin.com/in/adityapunjani/" rel="noopener ugc nofollow" target="_blank">Aditya Punjani</a>, <a class="bu ht" href="https://www.linkedin.com/in/joshpolsky/" rel="noopener ugc nofollow" target="_blank">Josh Polsky</a>, <a class="bu ht" href="https://www.linkedin.com/in/jnvollmer/" rel="noopener ugc nofollow" target="_blank">Jean-Nicolas Vollmer</a>, <a class="bu ht" href="https://www.linkedin.com/in/wensheng-mao-76ab7142/" rel="noopener ugc nofollow" target="_blank">Wensheng Mao</a> and everyone else who helped along the way.</p><p id="8971" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Interested in working at Airbnb? 
Check out these roles:<br /><a class="bu ht" href="https://grnh.se/6c9839421us" rel="noopener ugc nofollow" target="_blank">Staff Android Engineer</a><br /><a class="bu ht" href="https://grnh.se/1e5c9bf51us" rel="noopener ugc nofollow" target="_blank">Senior Android Engineer</a> <br /><a class="bu ht" href="https://grnh.se/aa366a2e1us" rel="noopener ugc nofollow" target="_blank">Senior Android Engineer</a><br /><a class="bu ht" href="https://grnh.se/20c296251us" rel="noopener ugc nofollow" target="_blank">Android Engineer, Special Projects</a></p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-android-f9fd5e733e</link>
      <guid>https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-android-f9fd5e733e</guid>
      <pubDate>Fri, 17 Dec 2021 22:06:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Automating Data Protection at Scale, Part 3]]></title>
      <description><![CDATA[<div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@lizzynammour?source=post_page-----34e592c45d46-----------------------------------"><img alt="elizabeth nammour" class="s er es et" src="https://miro.medium.com/fit/c/96/96/1*hN31NMaZ5de5fou4637BIw.jpeg" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@lizzynammour?source=post_page-----34e592c45d46-----------------------------------">elizabeth nammour</a><div class="eu n"><div class="fv s"><div><div><div class="ft" role="tooltip" aria-hidden="false"><div class="s"></div></div></div></div></div></div></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-3-34e592c45d46?source=post_page-----34e592c45d46-----------------------------------">Dec 16</a> · 12 min read</div></div></div><div class="n gi gj gk gl gm gn go gp z"><div class="n o"><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div></div></div></div></div></div><p id="0897" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Part three of a series on how we provide powerful, automated, and scalable data privacy and security engineering capabilities at Airbnb</p><p id="147f" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a class="bu hs" href="https://www.linkedin.com/in/elizabethnammour/" rel="noopener ugc 
nofollow" target="_blank">Elizabeth Nammour</a>, <a class="bu hs" href="https://www.linkedin.com/in/pinyao-guo-6b621684/" rel="noopener ugc nofollow" target="_blank">Pinyao Guo</a>, Jamie Chong, <a class="bu hs" href="https://www.linkedin.com/in/wendy-jing-jin-81452921/" rel="noopener ugc nofollow" target="_blank">Wendy Jin</a></p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><img alt="" class="aj if ig" src="https://miro.medium.com/max/1400/0*opUbxGQ8bUEHduhi" width="700" height="460" role="presentation" /></div></div></figure><h1 id="c252" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Introduction</h1><p id="1dac" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">In <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-1-c74909328e08">Part 1</a> and <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-2-c2b8d2068216">Part 2</a> of our blog series, we gave an overview of the Data Protection Platform (DPP). We focused on how we built a global understanding of Airbnb’s data and its associated security and privacy risks. In this blog post, we will describe how we use this understanding to provide powerful and automated security and privacy engineering capabilities and empower data governance. 
In order to reduce risk across the entire Airbnb organization, we sought to address the following concerns:</p><ul class=""><li id="f114" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr ji jj jk el"><strong class="gw jl">Accountability: </strong>Security and privacy compliance are not solely the responsibilities of security and privacy teams, but should be enabled across the Airbnb platform, development experience, product life cycles, and enterprise vendor solutions. As the volume of data grows and services become more complex, we need to hold the teams who control that data within Airbnb (“service owners”) accountable for the security and privacy of that data</li><li id="6e8c" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr ji jj jk el"><strong class="gw jl">Minimal overhead: </strong>While service owners share the responsibility of reducing risks, we want to ensure we can automate the bulk of the work and minimize their operational load</li><li id="1dfc" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr ji jj jk el"><strong class="gw jl">Global alignment: </strong>Not everyone has exactly the same understanding of data classification and protection strategies. We aim to reach a consensus among security, privacy, legal, and service owners and provide a single source of truth for privacy and security annotations and actions</li></ul><p id="17a9" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In the following sections, we’ll first share a deep dive into the Data Protection Service, which integrates all components of our DPP and enables us to define custom data protection jobs based on our findings. 
Then, we will demonstrate concrete use cases of how the DPP reduces security and privacy risks.</p><h1 id="d337" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Data Protection Service</h1><p id="7101" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">We built the Data Protection Service (DPS) to integrate all components of the DPP and automate security and privacy actions for stakeholders.</p><p id="21f8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The DPS provides API endpoints to stakeholders or services outside the DPP, which allows them to query for privacy and security metadata stored in <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-1-c74909328e08">Madoka</a>. For example, we have an API endpoint that allows services to query for a list of data assets that contain any type of personal data. This enables downstream data services or pipelines to build their integrations.</p><p id="aaf1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The DPS also enables us to easily define custom “jobs” to automate specific steps, such as:</p><ul class=""><li id="6980" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr ji jj jk el"><strong class="gw jl">Creating JIRA notifications:</strong> In order to create JIRA tickets, the DPS uses an internal ticket generator that abstracts away the ticketing mechanisms and easily allows us to filter out any duplicate tickets. We just have to define a unique identifier for the findings so that no two tickets are filed for the same findings. JIRA is one of many ways to notify data owners. 
Slackbots, email notifications, and other internal vendor tools would also be feasible options.</li><li id="4a5e" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr ji jj jk el"><strong class="gw jl">Generating pull requests (PRs):</strong> In order to create PRs in GitHub Enterprise (GHE), we created a wrapper around GHE’s APIs to easily clone a repo, create a PR, and get the status of a PR. Within each job, we implement the logic of how to modify the repo’s target files and add them to a PR.</li></ul><h1 id="032b" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Data Protection Annotation Validation</h1><p id="84b2" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">To help us comply effectively and efficiently with data privacy laws, we need to know where personal data lives along with its lifecycle. We also need to protect data as it propagates across different data stores and services. To help achieve this goal, we define three levels of data classification annotations — critical, personal, public — and tag the data with the annotations.</p><p id="6f46" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">At Airbnb, engineers and data scientists can define database-export pipelines to export online MySQL table snapshots to offline Hive tables for data analysis. We require owners to tag each table column with data classification annotations. 
Using these tags, we are able to segregate and further protect the most sensitive data categories with appropriate access controls and retention limits.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div class="cs ct jr"><div class="jx s ic jy"><div class="jz ka s"><div class="js jt t u v ju aj at jv jw"><div class="ftr-noscript"><img alt="" class="t u v ju aj" src="https://miro.medium.com/max/1216/1*2deebHEi0NYUvzwUL6yHBg.png" width="608" height="788" srcset="https://miro.medium.com/max/552/1*2deebHEi0NYUvzwUL6yHBg.png 276w, https://miro.medium.com/max/1104/1*2deebHEi0NYUvzwUL6yHBg.png 552w, https://miro.medium.com/max/1216/1*2deebHEi0NYUvzwUL6yHBg.png 608w" role="presentation" /></div></div></div></div><figcaption class="ke kf cu cs ct kg kh bo b ev bq br">Example of database exports definition</figcaption></div></figure><p id="4824" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Service owners use an extension of Thrift Interface Description Language (IDL) to define data interfaces for inter-service communication. We require each field within an endpoint to be tagged with a data classification annotation, which is used to restrict service API access from high risk locations. Annotations are also used to help evaluate the security and privacy risks of a service. 
Below is an example of a Thrift IDL API definition.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ki"><div class="jx s ic jy"><div class="kj ka s"><div class="js jt t u v ju aj at jv jw"><div class="ftr-noscript"><img alt="" class="t u v ju aj" src="https://miro.medium.com/max/1400/1*lgUJzzwvRbFvH8nq7RBvrg.png" width="700" height="262" srcset="https://miro.medium.com/max/552/1*lgUJzzwvRbFvH8nq7RBvrg.png 276w, https://miro.medium.com/max/1104/1*lgUJzzwvRbFvH8nq7RBvrg.png 552w, https://miro.medium.com/max/1280/1*lgUJzzwvRbFvH8nq7RBvrg.png 640w, https://miro.medium.com/max/1400/1*lgUJzzwvRbFvH8nq7RBvrg.png 700w" role="presentation" /></div></div></div></div></div><figcaption class="ke kf cu cs ct kg kh bo b ev bq br">Example Service IDL API definition</figcaption></div></figure><p id="d669" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">However, annotations relying on human judgment are prone to errors. Service owners might misjudge or be unaware of the fields within their API or data column and annotate the data incorrectly. 
For this reason, we validate the correctness of data classification annotations.</p><h1 id="5a2d" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Database Exports Validation</h1><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><div class="jx s ic jy"><div class="kk ka s"><div class="js jt t u v ju aj at jv jw"><div class="ftr-noscript"><img alt="" class="t u v ju aj" src="https://miro.medium.com/max/1400/0*IGOZi0b-6ytJGFM4" width="700" height="413" srcset="https://miro.medium.com/max/552/0*IGOZi0b-6ytJGFM4 276w, https://miro.medium.com/max/1104/0*IGOZi0b-6ytJGFM4 552w, https://miro.medium.com/max/1280/0*IGOZi0b-6ytJGFM4 640w, https://miro.medium.com/max/1400/0*IGOZi0b-6ytJGFM4 700w" role="presentation" /></div></div></div></div></div><figcaption class="ke kf cu cs ct kg kh bo b ev bq br">Figure 1: Database Exports Data Classification Validation CI Check</figcaption></div></figure><p id="b712" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">To validate database-exports annotations, we created a CI check that leverages the DPS and runs whenever someone creates a database-exports PR.</p><p id="5491" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">For every column specified in the PR, the CI check does the following:</p><ol class=""><li id="5b31" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kl jj jk el">Queries the DPS to determine what the privacy classification should be for that column. If the classification and the PR annotation don’t match, the CI check will fail.</li><li id="e518" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr kl jj jk el">Otherwise, we run an additional set of regexes to determine what the data classification annotation of that column should be set to. 
This is mainly useful for tables that don’t contain any data, or in the case of false negatives.</li><li id="8bff" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr kl jj jk el">If both of these checks pass, then the CI check passes.</li></ol><p id="51be" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The data warehouse also uses data classification results to validate annotations on already-checked-in database-export files. A daily job queries the DPS to fetch data classifications for all Hive tables. The job notifies service owners if the classifications and annotations don’t match. These incorrectly annotated tables will be automatically dropped if service owners do not take any actions.</p><h1 id="5f07" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">IDL Validation</h1><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><div class="jx s ic jy"><div class="km ka s"><div class="js jt t u v ju aj at jv jw"><div class="ftr-noscript"><img alt="" class="t u v ju aj" src="https://miro.medium.com/max/1400/0*4GQjTy7YZ5GC99Sk" width="700" height="283" srcset="https://miro.medium.com/max/552/0*4GQjTy7YZ5GC99Sk 276w, https://miro.medium.com/max/1104/0*4GQjTy7YZ5GC99Sk 552w, https://miro.medium.com/max/1280/0*4GQjTy7YZ5GC99Sk 640w, https://miro.medium.com/max/1400/0*4GQjTy7YZ5GC99Sk 700w" role="presentation" /></div></div></div></div></div><figcaption class="ke kf cu cs ct kg kh bo b ev bq br">Figure 2: Service API Interface Data Language Validation</figcaption></div></figure><p id="67c1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">We leverage the traffic-capturing feature from Airbnb services to get request and response pairs for IDL APIs. 
<a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-2-c2b8d2068216">Inspekt</a> periodically sends requests to each service to obtain traffic samples. Inspekt then scans and classifies the traffic samples into data elements. Madoka then collects the scanning results from Inspekt and determines if there is any discrepancy between them and the annotation tags. The scanning result classification is determined by the highest sensitivity of all detected data elements. For instance, if the scanning result contains a bank account number (high) and a mailing address (medium), the final classification will be high. The discrepancy will be pinpointed to the specific field(s) within the IDL definition.</p><p id="c575" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When a discrepancy is found, the DPS creates a JIRA ticket and opens a PR for the service owner to fix the IDL annotations. The DPS locates the inconsistent field within the IDL annotation file and uses the GHE client to find the relevant contributor of the code. 
Then, it opens a PR with suggested changes and links to it from the created JIRA ticket.</p><h1 id="a779" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Privacy Data Subject Rights Orchestration</h1><p id="4c7a" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">With the evolution of privacy laws such as the <a class="bu hs" href="https://gdpr-info.eu/" rel="noopener ugc nofollow" target="_blank">General Data Protection Regulation</a> and <a class="bu hs" href="https://leginfo.legislature.ca.gov/faces/codes_displayText.xhtml?division=3.&amp;part=4.&amp;lawCode=CIV&amp;title=1.81.5" rel="noopener ugc nofollow" target="_blank">California Consumer Privacy Act</a>, individuals are able to exert more choice and control over how their personal data is collected, stored, and used. Certain data protection laws grant individuals specific data subject rights in relation to their personal data. These include “the right to be forgotten,” which gives a user the right to ask to have their personal data erased, and the right of access, which gives a user the right to know and obtain certain information about the data that an organization holds about them.</p><h2 id="ad4c" class="kn ii do bo ij ko kp kq im kr ks kt ip ku kv kw it kx ky kz ix la lb lc jb ld el">Obliviate</h2><p id="f302" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">To help us comply effectively with these regulations, we built a Data Subject Rights (DSR) orchestration service, called Obliviate, that helps coordinate and track DSR requests for erasure, access, or portability from our users.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct le"><div class="jx s ic jy"><div class="lf ka s"><div class="js jt t u v ju aj at jv jw"><div class="ftr-noscript"><img alt="" class="t u v ju aj" 
src="https://miro.medium.com/max/1400/0*d1DzGO0rY1A5SQsh" width="700" height="372" srcset="https://miro.medium.com/max/552/0*d1DzGO0rY1A5SQsh 276w, https://miro.medium.com/max/1104/0*d1DzGO0rY1A5SQsh 552w, https://miro.medium.com/max/1280/0*d1DzGO0rY1A5SQsh 640w, https://miro.medium.com/max/1400/0*d1DzGO0rY1A5SQsh 700w" role="presentation" /></div></div></div></div></div><figcaption class="ke kf cu cs ct kg kh bo b ev bq br">Figure 3: Obliviate Workflow</figcaption></div></figure><p id="3916" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When a consumer submits a DSR Erasure or Access and Portability request to Airbnb, that request gets forwarded to Obliviate. Obliviate propagates that request to downstream services by publishing it to a Kafka queue. Services that store and ‘own’ data at Airbnb are responsible for executing the DSR request by either deleting or fetching all of the personal data stored within their tables.</p><p id="1c5b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In order to streamline and simplify how data services interact with Obliviate, we built Obliviate clients to support all data services. The clients provide services with empty Thrift IDL schemas that need to be filled in, one for each DSR request — erasure, access, and portability. The service owner fills in each schema with all columns that the service ‘owns’ that contain personal data.</p><p id="8ecc" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The clients also provide services with a common interface to implement, which contains several methods responsible for executing each DSR request given a user id. 
The client is responsible for abstracting away the rest of the logic (e.g., initializing Kafka consumers and producers).</p><p id="4f00" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">For each DSR request, the Obliviate service monitors and waits for a response from each data service integrated with the client and notifies compliance upon completion. If a data service hasn’t responded, the Obliviate service retries multiple times until the request completes.</p><h2 id="39cb" class="kn ii do bo ij ko kp kq im kr ks kt ip ku kv kw it kx ky kz ix la lb lc jb ld el">Automating Obliviate Integrations</h2><p id="2cba" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Even with the client code abstracting away much of the logic, integrating with Obliviate still took a lot of engineering effort. Service owners had to manually sift through their data to determine the exact columns that store personal data, which was very time-consuming. They also had to integrate the client code and its dependencies within their service, which could take some time to test and debug. In addition to being time-consuming, relying on service owners to identify all personal data in their data stores was error-prone, since they might overlook a column or not be sure what that column contains.</p><p id="d7df" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">We decided to use the DPS to automate these integrations as much as possible. 
The automated integration runs as a daily job with the following steps:</p><ol class=""><li id="3adf" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kl jj jk el">The DPS sends requests to Madoka and fetches the list of columns that contain personal data but have not been integrated with Obliviate yet, along with the service that owns each column.</li><li id="762f" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr kl jj jk el">For each service in that mapping, the DPS creates a PR that integrates the service with the Obliviate client code and its dependencies (if it hasn’t been integrated already) and appends each column associated with that service to the Thrift structs.</li><li id="cadd" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr kl jj jk el">The DPS creates a JIRA ticket that links to the PR and assigns it to the service owner.</li></ol><p id="e490" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">All the service owner has to do is implement the three methods in the interface described above by deleting or returning all rows associated with that user from the columns included in the Thrift structs.</p><h1 id="c656" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Eliminating Accidental Secret Leakage</h1><p id="2554" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">In our <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-2-c2b8d2068216">previous blog post</a>, we described how we built Angmar to detect business and infrastructure “secrets” in code and how Inspekt detects personal data and business or infrastructure secrets in data stores and service logs. 
The DPS enables automated notifications and actions based on these findings and metadata from other upstream services in the Data Protection Platform. Next, we’ll take a look at a few examples of how the DPS eliminates such potential leakages at Airbnb.</p><h2 id="0029" class="kn ii do bo ij ko kp kq im kr ks kt ip ku kv kw it kx ky kz ix la lb lc jb ld el">“Secrets” in Data Stores and Logs</h2><p id="49bb" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Once an area of potential leakage is located, the DPS automatically creates a security vulnerability ticket specifying the exact leakage point and assigns the ticket to the owner. Each ticket is filed with a tag that allows security operators to track the resolution of the ticket and collect metrics. After secrets are detected in data stores and service logs, we must identify the service owner accountable for the detected records.</p><p id="ea03" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In the <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-1-c74909328e08">ownership section</a> of part 1 of our blog post, we described how the Madoka service collects the service ownership property for our data assets. Once records are found, the DPS makes an API call to Madoka with the data asset metadata included within the detected record. For instance, for MySQL, the DPS sends a request to Madoka with the database cluster name and the table name; for service logs, the DPS calls Madoka with the service name. Madoka then responds with the corresponding team or individual “owner” of the assets.</p><p id="7bd6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">To avoid further data leakage, tickets only contain data asset metadata instead of the detected data content. 
For instance, for detected records in service logs, the ticket records only the service log code template that introduces the vulnerability and the secret type found during the scan, not the actual content. Once the owner receives the ticket, they are expected to locate the secret within their data stores and service logs.</p><p id="8ee1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">A bottleneck we observed after rolling out the DPS is that resolving generated tickets still needs manual verification. That is, when a ticket is resolved by the owner, the security team needs to verify that either the leaked secrets are removed from the data stores and service logs or the logging template leading to the leakage is removed from source code. To further reduce the operational cost, we plan to create an automated verification solution in the future that triggers a regression scan when owners resolve a secret leakage ticket. For instance, for a resolved secret logging ticket, the DPS can trigger a scan over affected source code and see if the previous logging template is removed. The DPS can also trigger a scan over the affected logging cluster and search for the leaked secret to ensure that the secret is safely removed.</p><h2 id="6c0d" class="kn ii do bo ij ko kp kq im kr ks kt ip ku kv kw it kx ky kz ix la lb lc jb ld el">Secrets in Code</h2><p id="07b3" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">After a secret is detected within a CI check job, the CI job executes `git blame` to find the most recent contributor of the secret. If that contributor has left the company, we trace up the contributor’s management chain until we find an active employee. 
After owner identification, the DPS performs a few operations:</p><ul class=""><li id="2b14" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr ji jj jk el"><strong class="gw jl">It de-duplicates secret findings: </strong>To avoid duplicate tickets and notifications for the same secret within the same file, we calculate a hash of the secret and the path name of the target file. When the hash matches an existing value, the DPS ignores the finding.</li><li id="6757" class="gu gv do gw b gx jm gz ha hb jn hd he hf jo hh hi hj jp hl hm hn jq hp hq hr ji jj jk el"><strong class="gw jl">It sends a notification: </strong>Alerts are sent to a dedicated Slack channel and to Datadog for metrics collection. When security operators are contacted, these notifications serve as references that provide context for proper guidance.</li></ul><p id="966f" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The DPS automates secret protection in the Airbnb codebase and minimizes the operational load on security operations. Compared with a pentesting program in which pentesters manually triage secret leakages and run the resolution process, Angmar requires far fewer manual operations.</p><h1 id="743f" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Conclusion</h1><p id="1d1f" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">This post concludes our three-part series on how we are automating data protection at scale at Airbnb. We explained how understanding the data, by storing privacy and security metadata in a central service and by automatically classifying what type of data is stored where, is a necessary building block for protecting the data. 
In this blog post, we focused on use cases where the data protection platform helped us to reduce security and privacy risk.</p><p id="021b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">If this type of work interests you, see <a class="bu hs" href="http://careers.airbnb.com/" rel="noopener ugc nofollow" target="_blank">careers.airbnb.com</a> for current openings.</p><h1 id="a2c6" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Acknowledgments</h1><p id="d66c" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">The Data Protection Platform was made possible by all team members of the data security team: Shengpu Liu, Zi Liu, Jesse Rosenbloom, Serhi Pichkurov, and Julia Cline. Thanks to our leadership, Marc Blanchou, Joy Zhang, Brendon Lynch, Paul Nikhinson, and Vijaya Kaza, for supporting our work. Thanks to Christopher Barcellos for reviewing our blog post. Thanks to the Trust Privacy team for the great partnership: Jujhaar Singh, Ansuman Acharya, Zoya Sultana, Steve Hill, Liam McInerney, Mamman Fan, Gustavo Alza, Shazad Sahak, Alice Park, Eliott Behar etc. Thanks to the vulnerability management team for building out the ticketing mechanism: Kadia Mashal, Keziah Plattner. Thanks to the data governance team for partnering and supporting our work: Andrew Luo, Shawn Chen, and Liyin Tang. Thank you Tina Nguyen and Cristy Schaan for helping drive and make this blog post possible. Thank you to previous members of the team who contributed greatly to the work: Lifeng Sang, Bin Zeng, Prasad Kethana, Alex Leishman, and Julie Trias.</p><p id="024e" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="lg">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. 
Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-3-34e592c45d46</link>
      <guid>https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-3-34e592c45d46</guid>
      <pubDate>Thu, 16 Dec 2021 19:32:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Airbnb’s Page Performance Score on iOS]]></title>
      <description><![CDATA[<div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@nickbryanmiller?source=post_page-----36d5f200bc73-----------------------------------"><img alt="Nicholas Miller" class="s er es et" src="https://miro.medium.com/fit/c/96/96/1*KK7Mlk2Y3MgUt4kkNcSQaw.png" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@nickbryanmiller?source=post_page-----36d5f200bc73-----------------------------------">Nicholas Miller</a><div class="eu n"><div class="fv s"><div><div><div class="ft" role="tooltip" aria-hidden="false"><div class="s"></div></div></div></div></div></div></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-ios-36d5f200bc73?source=post_page-----36d5f200bc73-----------------------------------">Dec 13</a> ·  Unlisted</div></div></div><div class="n gi gj gk gl gm gn go gp z"><div class="n o"></div></div></div></div><p id="1c4c" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el"><em class="hq">This is a continuation of our series on </em><a class="bu hr" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936"><em class="hq">Airbnb’s Page Performance Score</em></a><em class="hq">, a score that measures </em>multiple performance metrics from real users on any platform. 
Series: <a class="bu hr" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Part 1</a> and <a class="bu hr" rel="noopener" href="https://medium.com/airbnb-engineering/measuring-web-performance-at-airbnb-122da8d3ea3f">Part 2</a>.</p><p id="1227" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el"><a href="https://www.linkedin.com/in/nickbryanmiller/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Nicholas Miller</a></p><figure class="ht hu hv hw hx hy cs ct paragraph-image"><div role="button" tabindex="0" class="hz ia ib ic aj id"><div class="cs ct hs"><img alt="" class="aj ie if" src="https://miro.medium.com/max/1400/1*33B60glCNf0ePfNHrvwYew.jpeg" width="700" height="467" role="presentation" /></div></div></figure><p id="7402" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">At Airbnb, we created the <a class="bu hr" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Page Performance Score</a> to provide our engineers and data scientists a multitude of user-centric performance metrics to better understand and improve our products. In this post, we will dive deeper into how we define these metrics and instrument them on iOS.</p><h1 id="271e" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Page System</h1><p id="852a" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">The entire customer journey on Airbnb is divided into different pages, each of which has its own measured <a class="bu hr" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Page Performance Score</a> (PPS). 
In order to support this page-based performance tracking system, we built a standardized infrastructure that enables engineers to configure pages representing their features.</p><p id="19e0" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">On iOS, a page is associated with a <em class="hq">UIViewController</em>. We collect performance data throughout a <em class="hq">UIViewController’s</em> lifecycle and only emit the logging event on <em class="hq">viewDidDisappear</em>. This logging event cannot be created or sent without a <em class="hq">PageName,</em> a universal page identifier.</p><h1 id="a0b2" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Instrumentation</h1><p id="c4fb" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">Due to the many edge cases and complexities involved in instrumenting these metrics, we created a Page Performance Score state machine class, called <a href="https://docs.google.com/document/d/1itSfqXZaoU9sAG79HWlpE9euZ5UU8wMhhNTSc3_mua4/edit#heading=h.nfbrai4oupy3" class="bu hr" rel="noopener ugc nofollow" target="_blank"><em class="hq">PPSStateMachine</em></a>. This class encapsulates all the logic to track and compute the performance metrics and generate logging events. Any engineer who wants to log a PPS event can do so by obtaining the <em class="hq">PPSStateMachine</em> associated with their <em class="hq">UIViewController</em> and calling the relevant methods during the <em class="hq">UIViewController’s</em> lifecycle events. 
To make things even simpler, we’ve built additional tooling and infrastructure so engineers only need to provide a name for their page and the state of the content (e.g., loading, loaded, or error).</p><h1 id="e2f1" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">PPSStateMachine</h1><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><h1 id="f346" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Time</h1><p id="c037" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">When measuring performance, all time is measured in nanoseconds and then converted into milliseconds. By declaring dedicated typealiases for nanoseconds (UInt64) and milliseconds (Float64), we force developers to think about the scale when converting to more commonly used types (e.g., Int, Float).</p><p id="938a" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">When taking the current time, we use a monotonic clock: a clock whose value only ever increases and that continues to increment while the system is asleep. The value is a 64-bit count of nanoseconds.</p><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><p id="d29f" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">When marking the start and end time of a duration, we have a computed variable that returns the current time in milliseconds. 
This allows us to avoid most accuracy and precision errors due to casting.</p><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><h2 id="116d" class="jk ih do bo ii jl jm jn il jo jp jq io jr js jt is ju jv jw iw jx jy jz ja ka el">Example</h2><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><h1 id="c66b" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">View Association</h1><p id="f62e" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">Every <em class="hq">UIViewController</em> has an associated <em class="hq">PPSStateMachine</em>. This <em class="hq">PPSStateMachine</em> can be overridden in the event the developer wants to measure a series of pages under one name. Associating with a <em class="hq">UIViewController</em> allows the <em class="hq">PPSStateMachine</em> to be found on a <em class="hq">UIView</em> by crawling the view responder chain.</p><h1 id="848f" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Versioning</h1><p id="1747" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">Declaring lifecycle and semantic methods in the PPS protocol allows us to abstract away how the score is being calculated. Most updates to the PPS formula — with the exception of entirely new metrics such as video performance — do not result in developers needing to update their respective features. Behind the scenes, any major change to the formula is first tested by placing the potential value into the logged event’s metadata. 
Once the potential value is validated, it can be upgraded to an official value that affects the page’s performance score.</p><h1 id="25c4" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Metric Implementation</h1><h1 id="132f" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Time to First Layout (TTFL)</h1><p id="7b65" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">TTFL starts during the UIViewController’s viewDidLoad and ends after the UIViewController’s first viewDidLayoutSubviews.</p><h1 id="dbd0" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Time to Initial Load (TTIL)</h1><p id="5932" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">TTIL starts during the UIViewController’s viewDidLoad and ends one render cycle after loaded content has been set.</p><figure class="ht hu hv hw hx hy cs ct paragraph-image"><div class="cs ct kb"><div class="jh s ib kh"><div class="ki jj s"><div class="kc kd t u v ke aj at kf kg"><div class="ftr-noscript"><img alt="" class="t u v ke aj" src="https://miro.medium.com/max/1200/1*EW0b3z7ZpIrJhtzI0W2DSA.gif" width="600" height="1067" srcset="https://miro.medium.com/max/552/1*EW0b3z7ZpIrJhtzI0W2DSA.gif 276w, https://miro.medium.com/max/1104/1*EW0b3z7ZpIrJhtzI0W2DSA.gif 552w, https://miro.medium.com/max/1200/1*EW0b3z7ZpIrJhtzI0W2DSA.gif 600w" role="presentation" /></div></div></div></div><figcaption class="km kn cu cs ct ko kp bo b ev bq br">This is for illustrative purposes only and does not necessarily show anything that may or may not be available on Airbnb at any time. 
The content shown in the image may or may not be correct.</figcaption></div></figure><h1 id="4c63" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Scroll Thread Hangs (STH)</h1><p id="37d5" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">An STH is reported as the difference between the duration of the hitch and the maximum frame duration; hitches below a minimum threshold of twice the refresh rate are filtered out.</p><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><p id="51e2" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el"><a href="https://developer.apple.com/documentation/quartzcore/cadisplaylink" class="bu hr" rel="noopener ugc nofollow" target="_blank"><em class="hq">CADisplayLink</em></a> accurately observes most STHs. Its <em class="hq">RunLoop.Mode</em> is <em class="hq">RunLoop.Mode.tracking</em>.</p><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><p id="2b6b" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">Every time the display link fires, we make a calculation based on the previous frame and the current frame.</p><figure class="ht hu hv hw hx hy cs ct paragraph-image"><div class="cs ct kb"><div class="jh s ib kh"><div class="ki jj s"><div class="kc kd t u v ke aj at kf kg"><div class="ftr-noscript"><img alt="" class="t u v ke aj" src="https://miro.medium.com/max/1200/1*D9y4xHtuKdAEf1ytdM-iEQ.gif" width="600" height="1067" srcset="https://miro.medium.com/max/552/1*D9y4xHtuKdAEf1ytdM-iEQ.gif 276w, https://miro.medium.com/max/1104/1*D9y4xHtuKdAEf1ytdM-iEQ.gif 552w, https://miro.medium.com/max/1200/1*D9y4xHtuKdAEf1ytdM-iEQ.gif 600w" role="presentation" /></div></div></div></div><figcaption class="km kn cu cs ct ko kp bo b ev bq br">This is for illustrative purposes only and does not necessarily show anything that may or may not be available on Airbnb at any time. 
The content shown in the image may or may not be correct.</figcaption></div></figure><p id="e76c" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">Main Thread Hangs (MTH) tracking could exist on iOS; however, accurately tracking MTH incurs a small but consistent drag on performance. In our tests of MTH tracking, the CPU was not able to sleep, the battery was drained, and the metric wasn’t giving us significantly more information regarding visually perceived performance than STH. As a result, we decided not to measure MTH on iOS.</p><h1 id="614f" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Additional Load Time (ALT)</h1><p id="6e26" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">ALT starts when a loader is shown and ends one render cycle after the loader is gone and content is set.</p><p id="83e0" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">To illustrate this metric, let’s take a look at infinite scroll. If the bottom is reached before the next page has been loaded, then the ALT recorded is the time from when the loader (or bottom) becomes visible until the next page has loaded. If the bottom is never reached, for instance, due to prefetching, then an ALT of zero is logged. 
To log this accurately, we need the scroll percentage, the visibility of the bottom loader, and a state machine that tracks the previous state.</p><figure class="ht hu hv hw hx hy cs ct paragraph-image"><div class="cs ct kb"><div class="jh s ib kh"><div class="ki jj s"><div class="kc kd t u v ke aj at kf kg"><div class="ftr-noscript"><img alt="" class="t u v ke aj" src="https://miro.medium.com/max/1200/1*WTBC00sL2fp9MW5xom2jpA.gif" width="600" height="1067" srcset="https://miro.medium.com/max/552/1*WTBC00sL2fp9MW5xom2jpA.gif 276w, https://miro.medium.com/max/1104/1*WTBC00sL2fp9MW5xom2jpA.gif 552w, https://miro.medium.com/max/1200/1*WTBC00sL2fp9MW5xom2jpA.gif 600w" role="presentation" /></div></div></div></div><figcaption class="km kn cu cs ct ko kp bo b ev bq br">This is for illustrative purposes only and does not necessarily show anything that may or may not be available on Airbnb at any time. The content shown in the image may or may not be correct.</figcaption></div></figure><h1 id="1bb7" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Rich Content Load Time (RCLT)</h1><p id="7d84" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">RCLT instrumentation is entirely hidden from engineers by our view abstraction, <em class="hq">URLImageView</em>, which is capable of showing an image from a URL.</p><p id="f201" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">RCLT only tracks the time that a loader or placeholder is visible. If a loading image is hidden, then the act of hiding marks the end of the RCLT.</p><p id="38f0" class="gs gt do gu b gv gw gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp dg el">On every <em class="hq">URLImageView</em> state change, the corresponding <em class="hq">PPSStateMachine</em> is found by crawling the view’s responder chain and is updated with whether the image has loaded. 
The <em class="hq">PPSStateMachine</em> calculates the duration and, if the duration is under a specified threshold, discards the URL and saves only the duration so that logs do not grow too large.</p><figure class="ht hu hv hw hx hy"><div class="jh s ib"></div></figure><figure class="ht hu hv hw hx hy cs ct paragraph-image"><div class="cs ct kb"><div class="jh s ib kh"><div class="ki jj s"><div class="kc kd t u v ke aj at kf kg"><div class="ftr-noscript"><img alt="" class="t u v ke aj" src="https://miro.medium.com/max/1200/0*FDRRuig-2gi_9IZP" width="600" height="1067" srcset="https://miro.medium.com/max/552/0*FDRRuig-2gi_9IZP 276w, https://miro.medium.com/max/1104/0*FDRRuig-2gi_9IZP 552w, https://miro.medium.com/max/1200/0*FDRRuig-2gi_9IZP 600w" role="presentation" /></div></div></div></div><figcaption class="km kn cu cs ct ko kp bo b ev bq br">This is for illustrative purposes only and does not necessarily show anything that may or may not be available on Airbnb at any time. The content shown in the image may or may not be correct.</figcaption></div></figure><h1 id="d849" class="ig ih do bo ii ij ik gx il im in hb io ip iq ir is it iu iv iw ix iy iz ja jb el">Summary</h1><p id="3d92" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">Our current implementation of PPS on iOS has allowed engineers to quickly implement and receive real performance data. We are continually evolving and expanding our tooling and infrastructure. 
We hope that you can apply and advance our learnings in your company.</p><h2 id="58a2" class="jk ih do bo ii jl jm jn il jo jp jq io jr js jt is ju jv jw iw jx jy jz ja ka el">Appreciations</h2><p id="75b5" class="gs gt do gu b gv jc gx gy gz jd hb hc hd je hf hg hh jf hj hk hl jg hn ho hp dg el">Thank you to everyone who has helped build PPS on Native: <a href="https://www.linkedin.com/search/results/all/?keywords=luping%20lin&amp;origin=RICH_QUERY_SUGGESTION&amp;position=0&amp;searchId=58011edb-813b-43c3-9f00-f886aa446e84&amp;sid=VYi" class="bu hr" rel="noopener ugc nofollow" target="_blank">Luping Lin</a>, <a href="https://www.linkedin.com/search/results/all/?keywords=joshua%20nelson%20%E2%9C%A8&amp;origin=RICH_QUERY_SUGGESTION&amp;position=0&amp;searchId=959d4aca-c80e-448a-b415-4a732ba7a84d&amp;sid=Rr6" class="bu hr" rel="noopener ugc nofollow" target="_blank">Josh Nelson</a>, <a href="https://www.linkedin.com/in/hdezninirola/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Antonio Niñirola</a>, <a href="https://www.linkedin.com/in/kellerbryan19/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Bryan Keller</a>, <a href="https://www.linkedin.com/in/noahsmartin/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Noah Martin</a>, <a href="https://www.linkedin.com/in/scheuermann/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Andrew Scheuermann</a>, <a href="https://www.linkedin.com/in/joshpolsky/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Josh Polsky</a>, <a href="https://www.linkedin.com/in/jnvollmer/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Jean-Nicolas Vollmer</a>, <a 
href="https://www.linkedin.com/in/wensheng-mao-76ab7142/" class="bu hr" rel="noopener ugc nofollow" target="_blank">Wensheng Mao</a> and everyone else who helped along the way.</p></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-ios-36d5f200bc73</link>
      <guid>https://medium.com/airbnb-engineering/airbnbs-page-performance-score-on-ios-36d5f200bc73</guid>
      <pubDate>Mon, 13 Dec 2021 16:41:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[How Airbnb Supports Co-Hosting]]></title>
      <description><![CDATA[<div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@angierao?source=post_page-----edfb11d88575-----------------------------------"><img alt="Angeline Rao" class="s er es et" src="https://miro.medium.com/fit/c/96/96/2*Sz6ox3mfes6XO1BCD2I2pQ.png" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@angierao?source=post_page-----edfb11d88575-----------------------------------">Angeline Rao</a><div class="eu n"><div class="fv s"><div><div><div class="ft" role="tooltip" aria-hidden="false"><div class="s"></div></div></div></div></div></div></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/how-airbnb-supports-co-hosting-edfb11d88575?source=post_page-----edfb11d88575-----------------------------------">Dec 9</a> · 11 min read</div></div></div><div class="n gi gj gk gl gm gn go gp z"><div class="n o"><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="gq s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div><div class="s ap"><div><div class="ft" role="tooltip" aria-hidden="false"></div></div></div></div></div></div></div><p id="30e1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">A deep dive into Airbnb’s collaborative hosting infrastructure</p><p id="0cd8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">By: <a href="https://www.linkedin.com/in/angelinerao/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Angeline Rao</a></p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div 
role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><img alt="" class="aj if ig" src="https://miro.medium.com/max/1400/1*k8WLBeLdPKkv-MrppJi8kQ.jpeg" width="700" height="525" role="presentation" /></div></div></figure><h1 id="8d9e" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Introduction</h1><p id="f5de" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Airbnb’s mission is to empower Hosts to deliver one-of-a-kind stays that make it possible for guests to experience the world in a more authentic and connected way. Sometimes hosting is handled by one person, but in many cases hosting is a group effort. Hosts often share their responsibilities with another trusted person, such as a family member or a neighbor. These trusted partners are Co-Hosts on the Airbnb platform who are granted access to the Host’s listing, reservations, and messaging with guests.</p><p id="d82c" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Co-Hosting is just one form of Host collaboration. As hosting has become mainstream, the scale of hosting has grown as well; in fact, many people now host on Airbnb as their primary occupation. From Host entrepreneurs running their own businesses, to Hosts that are part of established hospitality companies, these types of Hosts collaborate through a Team on Airbnb. Within a Team, hosting team members are granted roles that correspond to their real world hosting responsibilities (e.g., guest manager) and have a set of corresponding permissions (e.g., permitted to message guests).</p><p id="5bc6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">As the number of collaborative Hosts grows and new forms of collaboration get introduced, the engineering work to support them becomes increasingly complex. 
With this challenge in mind, Airbnb has developed a single common infrastructure that can support all current and future Airbnb collaboration products. This solution is now available for all internal teams.</p><p id="4a6b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In this blog post, we will cover the unified architecture of collaborative hosting at Airbnb and how we use this shared infrastructure to streamline the process of building products for Hosts. In the next section, we will illustrate why supporting collaborative hosting without a shared infrastructure quickly became unwieldy. Then, we will walk through Airbnb’s architecture for collaborative hosting. Finally, we will discuss how this infrastructure supports the needs of product engineers.</p><h1 id="6618" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Background &amp; Motivation</h1><p id="a9b9" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Before we jump into collaborative hosting, let’s consider the single Host model. Because only one person is associated with each listing, this data could easily be stored with a <code class="ji jj jk jl jm b">host_id</code> column in our listings database. We can then perform a single check to figure out whether a user has permission to take an action on a listing. 
This might look like the following:</p><pre class="hu hv hw hx hy jn gb bd">if (isListingHost) {<br />     // Take action on listing<br />}</pre><p id="b01d" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When Airbnb launched its first collaborative hosting product, Co-Hosting, we used these types of comparisons, just as we did with the single Host model.</p><p id="4454" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In our business logic, we frequently need to answer three types of questions around person-to-resource (e.g., listing, reservation, review) relationships:</p><ol class=""><li id="268e" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jt ju jv el"><strong class="gw jw">Permissions</strong>: Given a person and a resource, can this person take action <code class="ji jj jk jl jm b">X</code> on this resource? The answer to this question helps us ensure that people only take actions on Airbnb that they are explicitly permitted to take. We must not permit someone to edit the price on an unrelated Host’s listing, for example.</li><li id="0f9b" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr jt ju jv el"><strong class="gw jw">Collection Queries</strong>: Given a person, what are the resources that they can access? The answer to this question helps us determine which message threads to display in a person’s Airbnb inbox, for example.</li><li id="2e93" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr jt ju jv el"><strong class="gw jw">Hosts to Display and Notify</strong>: Given a resource, who should be displayed to guests, and who should be notified of updates to this resource? 
The answer to this question helps us determine who should be displayed as the Host(s) of this listing, for example.</li></ol><p id="43d7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">After Co-Hosting, we soon found that performing the types of comparisons that we did for the single host model does not scale well to collaborative and more complex use cases.</p><ul class=""><li id="8f62" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kc ju jv el">Continually adding logic that is specific to a single type of collaborative hosting results in unwieldy code. For example, permissions checks might start to look like the following:</li></ul><pre class="hu hv hw hx hy jn gb bd">if (isListingHost || <br />    isListingCoHost || <br />    isListingTeamMember ||  <br />    isListingCollabHost1 || <br />    isListingCollabHost2 || <br />    ...) {<br />     // Take action on listing<br />}</pre><ul class=""><li id="3e77" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kc ju jv el">Engineers who are building a new feature need to understand all of the existing types of collaborative hosting and decide how collaborative Hosts should interact with the feature (e.g., which types of Team members should have access to this feature?). If engineers do not include every use case, the feature will not be available to all Hosts.</li></ul><p id="90c1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Without any kind of unifying framework, product development for Hosts can quickly become a laborious process.</p><p id="b4d8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Today, because of our collaborative hosting infrastructure, product engineers do not need to worry about specific types of collaborative hosting. 
They only need to know three things, all of which we will cover in this post:</p><ol class=""><li id="32bb" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jt ju jv el">For permissions, query Himeji.</li><li id="7d94" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr jt ju jv el">For collection queries, use the resource’s dedicated service.</li><li id="2349" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr jt ju jv el">For Hosts to display or notify, use the Collaborative Hosting API.</li></ol><h1 id="4686" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Collaborative Hosting Core Architecture</h1><p id="2377" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">We use user groups as the data model to represent any group of people. A user group is defined by an id, a group type (e.g., <code class="ji jj jk jl jm b">COHOSTING</code>, <code class="ji jj jk jl jm b">TEAM</code>), and a list of user group members.</p><p id="d458" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Each member in a user group is defined by their Airbnb user id and a user group role, which allows us to differentiate between the different types of members within a user group. 
For example, if a Host (listing owner) has a Co-Host, then the corresponding user group has type <code class="ji jj jk jl jm b">COHOSTING</code> and two members: the Host, who has the <code class="ji jj jk jl jm b">LISTING_OWNER</code> role, and the Co-Host, who has the <code class="ji jj jk jl jm b">LISTING_COHOST</code> role.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct kd"><div class="kj s ic ji"><div class="kk kl s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="Diagram of how a Host and their Co-Host are represented in the user group data model" class="t u v kg aj" src="https://miro.medium.com/max/1400/1*6JfNAQlj8Ml9LUHRxlh8AA.png" width="700" height="334" srcset="https://miro.medium.com/max/552/1*6JfNAQlj8Ml9LUHRxlh8AA.png 276w, https://miro.medium.com/max/1104/1*6JfNAQlj8Ml9LUHRxlh8AA.png 552w, https://miro.medium.com/max/1280/1*6JfNAQlj8Ml9LUHRxlh8AA.png 640w, https://miro.medium.com/max/1400/1*6JfNAQlj8Ml9LUHRxlh8AA.png 700w" /></div></div></div></div></div><figcaption class="kp kq cu cs ct kr ks bo b ev bq br"><em class="kt">A Host and their Co-Host represented in the user group data model</em></figcaption></div></figure><p id="35a4" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">This model is extensible to hosting teams as well. 
We support several roles specific to Teams based on how hosting teams commonly break down responsibilities between team members, such as the <code class="ji jj jk jl jm b">LISTING_MANAGER</code> role, the <code class="ji jj jk jl jm b">FINANCE_MANAGER</code> role, and the <code class="ji jj jk jl jm b">GUEST_MANAGER</code> role.</p><p id="e4e8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In the creation and deletion flows for a Co-Host or Team, the corresponding user group is updated accordingly.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ku"><div class="kj s ic ji"><div class="kv kl s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="Diagram showing that when Co-Hosts or Teams get updated in the product, in addition to the Co-Host and Teams sources of truth getting updated, the corresponding user groups get updated as well" class="t u v kg aj" src="https://miro.medium.com/max/1400/0*DV48R3dUKBjEryHK" width="700" height="350" srcset="https://miro.medium.com/max/552/0*DV48R3dUKBjEryHK 276w, https://miro.medium.com/max/1104/0*DV48R3dUKBjEryHK 552w, https://miro.medium.com/max/1280/0*DV48R3dUKBjEryHK 640w, https://miro.medium.com/max/1400/0*DV48R3dUKBjEryHK 700w" /></div></div></div></div></div><figcaption class="kp kq cu cs ct kr ks bo b ev bq br">Updates in product will trigger changes in both the source of truth (Co-Hosting or Teams) and the corresponding user groups</figcaption></div></figure><h2 id="d3f6" class="jo ii do bo ij kw kx ky im kz la lb ip lc ld le it lf lg lh ix li lj lk jb ll el"><strong class="ca">Resource &lt;&gt; User Group Associations</strong></h2><p id="8f9d" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Now that we have a model for any collaborative hosting group, we want to associate each group with the group’s corresponding resources. 
This way, when we are trying to answer questions around whether a person has a relation with a given resource, there is a single source that will give us the answer, regardless of the specific collaborative relationship. We keep track of these resource &lt;&gt; user group associations by storing the Airbnb resource id, the user group id, and the timestamp when the association was created.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lm"><div class="kj s ic ji"><div class="ln kl s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="Example ListingUserGroupAssociations table showing that listing A has been associated with user group C and listing B has been associated with user group D" class="t u v kg aj" src="https://miro.medium.com/max/1400/0*QH1mAPP9qNiJy0sM" width="700" height="185" srcset="https://miro.medium.com/max/552/0*QH1mAPP9qNiJy0sM 276w, https://miro.medium.com/max/1104/0*QH1mAPP9qNiJy0sM 552w, https://miro.medium.com/max/1280/0*QH1mAPP9qNiJy0sM 640w, https://miro.medium.com/max/1400/0*QH1mAPP9qNiJy0sM 700w" /></div></div></div></div></div><figcaption class="kp kq cu cs ct kr ks bo b ev bq br"><em class="kt">Example ListingUserGroupAssociations table showing that listing A has been associated with user group C and listing B has been associated with user group D</em></figcaption></div></figure><p id="c948" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">There are two scenarios in which resource &lt;&gt; user group associations need to be updated:</p><ol class=""><li id="67a4" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jt ju jv el">When a collaborative hosting relationship gets updated. 
For example, when a hosting team gets created, all of the Team creator’s resources get associated with the Team’s corresponding user group</li><li id="b574" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr jt ju jv el">When a collaborative hosting resource is updated. For example, when a guest books a reservation on a Co-Hosted listing, we need to associate the Co-Host user group with the new reservation so that the listing’s Co-Hosts can help the listing owner with hosting.</li></ol><p id="46f7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">If updates in response to these events do not happen in a timely manner, the product experience might fall out of date. For example, if a Host adds a Co-Host to a listing, but the underlying association is not updated, the Co-Host will not have access to the listing and its reservations.</p><p id="2d3d" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">In a business of Airbnb’s size, keeping resource &lt;&gt; user group associations up to date can be challenging. The state of affairs is constantly changing, sometimes in quick succession; a Host might create a hosting team and then change their mind and immediately delete it. As a result, race conditions do occur.</p><p id="f370" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">For the rest of this section, we will cover Airbnb’s scalable system to keep resource &lt;&gt; user group associations up to date in spite of race conditions. 
In the subsequent section, we detail how Airbnb leverages these resource &lt;&gt; user group associations during product development.</p><p id="7514" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw jw">A System for Achieving Accurate Resource &lt;&gt; User Group Associations</strong></p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lm"><div class="kj s ic ji"><div class="kv kl s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="Diagram of the system that keeps resource &lt;&gt; user group associations updated" class="t u v kg aj" src="https://miro.medium.com/max/1400/0*21elv_AMVnSJojew" width="700" height="350" srcset="https://miro.medium.com/max/552/0*21elv_AMVnSJojew 276w, https://miro.medium.com/max/1104/0*21elv_AMVnSJojew 552w, https://miro.medium.com/max/1280/0*21elv_AMVnSJojew 640w, https://miro.medium.com/max/1400/0*21elv_AMVnSJojew 700w" /></div></div></div></div></div><figcaption class="kp kq cu cs ct kr ks bo b ev bq br"><em class="kt">The system that keeps resource &lt;&gt; user group associations updated</em></figcaption></div></figure><p id="0610" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When any resource changes, our system springs into action and fetches affected resources to perform association updates. Because there can be thousands of resources to fetch, we use an internal job queue and scheduling system to break down the work into jobs to avoid timeouts and process in parallel. 
For all affected resources, we compare their user group associations with the current state of Airbnb and update the associations if needed.</p><p id="be80" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The side effect of processing updates in parallel is that there could be race conditions that result in inconsistencies between a source of truth and corresponding user group associations. For example, if a Host creates and then immediately deletes a Team, the resulting jobs would be executed in parallel, with the possibility of a downstream “create” job executing after a “delete” job.</p><p id="ed10" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">To address any inconsistencies introduced, the system will next fetch the now-updated resource &lt;&gt; user group associations and compare them with the source of truth. If there are any mismatches, it fixes them using a resilient queuing system that guarantees eventual consistency.</p><p id="095e" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">There are two notable benefits to the design of this system:</p><ul class=""><li id="4864" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kc ju jv el"><strong class="gw jw">Optimized Performance</strong>: Performing the anticipated updates first and then fixing inconsistencies retroactively allows for the vast majority of the processing to be done in batches and in parallel. 
This results in a much less expensive operation than if the resources were processed one-by-one right off the bat, even though the latter approach would eliminate the need for the step to fix inconsistencies.</li><li id="1232" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr kc ju jv el"><strong class="gw jw">Idempotent</strong>: Each resource update event triggers a re-calculation of associations that is agnostic to the specific type of update. As a result, we do not need to worry about receiving two opposite events, such as create and delete, in the wrong order. We thus have the guarantee that our system updates are idempotent.</li></ul><h1 id="0bd5" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Collaborative Hosting Infrastructure in Product Development</h1><p id="c62a" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Now that we have a system for achieving accurate user group &lt;&gt; resource associations, let’s revisit the three types of questions around person-to-resource relationships that engineers need to answer during development.</p><h2 id="b949" class="jo ii do bo ij kw kx ky im kz la lb ip lc ld le it lf lg lh ix li lj lk jb ll el"><strong class="ca">1. Permissions</strong></h2><p id="5f20" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">When checking if a user can edit the pricing on a listing, for example, we will now know whether the person is associated with this listing in any collaborative sense by checking the listing associations of the user’s user groups. 
We could find out, for example, that this user is a <code class="ji jj jk jl jm b">LISTING_MANAGER</code> team member in a user group associated with this listing.</p><p id="5569" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The last piece, then, is defining how roles map to actions — for example, we might decide that users who have the <code class="ji jj jk jl jm b">LISTING_COHOST</code> or <code class="ji jj jk jl jm b">LISTING_MANAGER</code> role can take the <code class="ji jj jk jl jm b">EDIT_PRICING</code> action on listings with which their user group is associated. At Airbnb, this mapping happens in configs defined in Himeji, our central authorization framework described in <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/himeji-a-scalable-centralized-system-for-authorization-at-airbnb-341664924574">this previous blog post</a>. Given a user, a resource, and an action, Himeji computes whether that user is permitted to take the action on the resource.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lo"><div class="kj s ic ji"><div class="lp kl s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="Architecture diagram for Himeji where Himeji is pointing into data sources for resource user group associations, resources, and user groups" class="t u v kg aj" src="https://miro.medium.com/max/1400/0*Vr2q-tgY6hqsd6xM" width="700" height="468" srcset="https://miro.medium.com/max/552/0*Vr2q-tgY6hqsd6xM 276w, https://miro.medium.com/max/1104/0*Vr2q-tgY6hqsd6xM 552w, https://miro.medium.com/max/1280/0*Vr2q-tgY6hqsd6xM 640w, https://miro.medium.com/max/1400/0*Vr2q-tgY6hqsd6xM 700w" /></div></div></div></div></div><figcaption class="kp kq cu cs ct kr ks bo b ev bq br">Architecture diagram for Himeji</figcaption></div></figure><h2 id="f0ea" class="jo ii do bo ij kw kx ky im kz la lb ip lc ld 
le it lf lg lh ix li lj lk jb ll el"><strong class="ca">2. Collection Queries</strong></h2><p id="ff6f" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">To fetch the resources that a person can access, we just need to make a single query by the person’s Airbnb user id and by the user group ids where the person has a permitted user group role. We use Elasticsearch to make resources searchable by user group id by joining the resource data source with the resource &lt;&gt; user group association table. This way, each resource’s Elasticsearch document has the list of user group ids that it is associated with.</p><p id="5ef4" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Each resource’s designated data service then serves an endpoint (e.g., <code class="ji jj jk jl jm b">getListingsByFilters</code>) that allows product engineers to pass a parameter (e.g., <code class="ji jj jk jl jm b">includeCollaborativeHosting=true</code>) to indicate that resources should be fetched by both the person’s user id and user group ids.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lq"><div class="kj s ic ji"><div class="lr kl s"><div class="ke kf t u v kg aj at kh ki"><div class="ftr-noscript"><img alt="Architecture diagram for collection queries showing mutations from resource user group associations and resources feeding into Elasticsearch, which gets queried, along with the user groups data source" class="t u v kg aj" src="https://miro.medium.com/max/1400/1*W-kl0OYyPkZviH4ySDmsDQ.png" width="700" height="389" srcset="https://miro.medium.com/max/552/1*W-kl0OYyPkZviH4ySDmsDQ.png 276w, https://miro.medium.com/max/1104/1*W-kl0OYyPkZviH4ySDmsDQ.png 552w, https://miro.medium.com/max/1280/1*W-kl0OYyPkZviH4ySDmsDQ.png 640w, https://miro.medium.com/max/1400/1*W-kl0OYyPkZviH4ySDmsDQ.png 700w" 
/></div></div></div></div></div><figcaption class="kp kq cu cs ct kr ks bo b ev bq br">Architecture diagram for collection queries</figcaption></div></figure><p id="a282" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Note that, similar to above, we did not need to know anything about Co-Hosting or Teams specifically to answer these types of questions.</p><h2 id="bd75" class="jo ii do bo ij kw kx ky im kz la lb ip lc ld le it lf lg lh ix li lj lk jb ll el"><strong class="ca">3. Determining Which Hosts to Display and Notify</strong></h2><p id="ebc9" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">The user(s) to display to guests or to notify about an update are not necessarily the same as the user(s) who have the corresponding permissions. For example, a Host who has a Co-Host may not want to receive notifications about guest messages, but they still want to access their Airbnb inbox.</p><p id="4073" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">As we saw previously, keeping track of the bespoke logic for each type of collaboration can become taxing for product teams. To address this issue, we built out a Collaborative Hosting API that takes all collaborative use cases into account, with endpoints such as <code class="ji jj jk jl jm b">getManagersToNotifyForReservation</code> and <code class="ji jj jk jl jm b">getManagersToDisplay</code>. Under the hood, we query the source of truth for each collaborative hosting use case and aggregate the results. This API abstracts away the specifics of collaborative hosting while still providing product engineers with the information that they need.</p><p id="08eb" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When naming API endpoints, we aimed to explicitly state the endpoint’s goal to reduce the chances that an engineer might misuse an API response. 
For example, the endpoint that returns users to notify for a reservation is named <code class="ji jj jk jl jm b">getManagersToNotifyForReservation</code>, instead of <code class="ji jj jk jl jm b">getReservationManagers</code>, which could be mistaken for a permissions endpoint that fetches the list of users that can modify a reservation.</p><h2 id="352f" class="jo ii do bo ij kw kx ky im kz la lb ip lc ld le it lf lg lh ix li lj lk jb ll el"><strong class="ca">Collaborative Hosting Playbook</strong></h2><p id="43e5" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">With our clear structure around how collaborative hosting works across use cases, we can establish concise steps for new product development within the existing framework.</p><p id="bdb5" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw jw">Introducing a New Type of Collaborative Hosting</strong></p><p id="ae7b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When a new collaborative hosting use case is introduced, integrating it into Airbnb requires just a few key changes to get most of the way there:</p><ul class=""><li id="a2d6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kc ju jv el">Update/add your use case to the Himeji config for permissions</li><li id="87c2" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr kc ju jv el">Update the resource &lt;&gt; user group association system to incorporate your use case</li><li id="7e22" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr kc ju jv el">Update the collaborative hosting API endpoints to incorporate logic from your new use case when considering notifications or display</li></ul><p id="5d5b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw 
jw">Introducing a New Airbnb Feature</strong></p><p id="8f47" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">When a new Airbnb feature is introduced, launching it to all Hosts just requires a few steps. Let’s say that Airbnb is building NewProduct. NewProduct will introduce a new type of resource, belos. We would need to:</p><ul class=""><li id="add7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kc ju jv el">Set up belo &lt;&gt; user group associations by adding belos to the user group &lt;&gt; resource association system</li><li id="bda8" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr kc ju jv el">Use these belo &lt;&gt; user group associations to create a search index for belos so that we can fetch belos by user group ids</li><li id="cd20" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr kc ju jv el">If needed, add a new endpoint for notifying and displaying users for belos</li></ul><h1 id="81f9" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Conclusion</h1><p id="68b7" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Airbnb has developed a collaborative hosting infrastructure that supports all types of Hosts. This makes it much easier to build products, as engineers just need to know about one central framework that will cover all hosting use cases. Collaborative hosting is critical to the success of many Hosts on Airbnb. 
A seamless developer experience when building for all Hosts allows us to empower Hosts to deliver great stays to guests.</p><p id="ebb7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">If this type of work interests you, check out some of our related positions:</p><ul class=""><li id="84f6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr kc ju jv el"><a href="https://careers.airbnb.com/positions/2921989/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Senior Android Software Engineer, Hosting</a></li><li id="7a94" class="gu gv do gw b gx jx gz ha hb jy hd he hf jz hh hi hj ka hl hm hn kb hp hq hr kc ju jv el"><a href="https://careers.airbnb.com/positions/2809890/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Senior iOS Software Engineer, Hosting</a></li></ul><p id="4f76" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">… and more at <a href="https://careers.airbnb.com/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Careers at Airbnb</a>!</p><h1 id="9e86" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Acknowledgments</h1><p id="f415" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Collaborative hosting is only possible as a result of the work of many incredible and mission-driven people over the years. Special thanks to Yi Lang Mok, Yan Li, Evelyn Shen, Amy Li, Aaron Holsonege, Eric Guan, Sujith Vishwajith, Alan Yao, JD Jiang, Brian Mason, Peggy Zheng, Chuan Shi, Dorothy Chang, Sharlene Luo, Jingyi Ni, Matias Figueroa, Charlie Jiang, Sushu Zhang, Ken Kao, Anna Majkowska, Jessica Tai, and many more.</p><p id="3b2f" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="ls">All product names, logos, and brands are property of their respective owners. 
All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/how-airbnb-supports-co-hosting-edfb11d88575</link>
      <guid>https://medium.com/airbnb-engineering/how-airbnb-supports-co-hosting-edfb11d88575</guid>
      <pubDate>Thu, 09 Dec 2021 20:32:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Measuring Web Performance at Airbnb]]></title>
      <description><![CDATA[<div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@joshuanelson?source=post_page-----122da8d3ea3f-----------------------------------"><img alt="Joshua Nelson" class="s er es et" src="https://miro.medium.com/fit/c/96/96/1*J2KH6Y4-sDMbUstgVwGzGw.jpeg" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@joshuanelson?source=post_page-----122da8d3ea3f-----------------------------------">Joshua Nelson</a></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/measuring-web-performance-at-airbnb-122da8d3ea3f?source=post_page-----122da8d3ea3f-----------------------------------">Dec 6</a> · 7 min read</div></div></div><p id="70b8" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Learn what web performance metrics Airbnb tracks, how we measure them, and how we consider tradeoffs between them in practice.</p><p id="4c17" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a href="https://www.linkedin.com/in/joshua-nelson-%E2%9C%A8-a0156523/" class="bu hs" target="_blank" rel="noopener ugc 
nofollow">Josh Nelson</a></p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><img alt="" class="aj if ig" src="https://miro.medium.com/max/1400/0*-MyZDBHAWNSTbGXG" width="700" height="467" role="presentation" /></div></div></figure><p id="b77f" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">How long did it take for this web page to load? It’s a common question industrywide, but is it the right one? Recently, there has been a shift from using single seconds-based metrics like “page load”, to metrics that paint a more holistic picture of performance, representing the experience from a website user’s perspective. At Airbnb, measuring the web performance that our guests and hosts actually experience is critical. Previously, we described how Airbnb <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">created a Page Performance Score</a> to combine multiple metrics from real users into a single score. In this blog post, we describe the metrics that we consider important on our website and how they relate to industry standards. We also discuss some case studies that moved these metrics, and how they impacted the experience of website visitors.</p><h1 id="abb0" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Web Performance Metrics</h1><p id="be89" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">There are five key performance metrics that we measure on our website. 
We chose these metrics because they represent performance as our users experience it, and because their definitions are <a href="https://chromium.googlesource.com/chromium/src/+/lkgr/docs/speed/good_toplevel_metrics.md" class="bu hs" target="_blank" rel="noopener ugc nofollow">simple, interpretable, and performant to compute</a>.</p><p id="9b1a" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">We record these metrics both for direct requests to the site and for client-side transitions between pages (Airbnb uses a <a href="https://developer.mozilla.org/en-US/docs/Glossary/SPA" class="bu hs" target="_blank" rel="noopener ugc nofollow">single-page app</a> architecture). We will give an overview of these metrics, how we instrument them, and their relative weightings in our overall <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">Page Performance Score</a>.</p><h2 id="7545" class="ji ii do bo ij jj jk jl im jm jn jo ip jp jq jr it js jt ju ix jv jw jx jb jy el">Time To First Contentful Paint</h2><p id="0a40" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Time To First Contentful Paint (<a href="https://web.dev/fcp/" class="bu hs" target="_blank" rel="noopener ugc nofollow">TTFCP</a>) measures the time between the start of navigation and the time at which <strong class="gw jz">anything appears on the screen</strong>. This could be text, a loading spinner, or any visual confirmation to the user that the website has received their request. We use the <a href="https://web.dev/fcp/#measure-fcp-in-javascript" class="bu hs" target="_blank" rel="noopener ugc nofollow">Paint Timing API</a> for direct requests. 
For client-routed transitions, we have written our own instrumentation that is triggered when a page transition begins:</p><figure class="hu hv hw hx hy hz"><div class="ka s ic"><figcaption class="kd ke cu cs ct kf kg bo b ev bq br"><em class="kh">A simplified version of our FCP polyfill for client transitions</em></figcaption></div></figure><h2 id="3aaf" class="ji ii do bo ij jj jk jl im jm jn jo ip jp jq jr it js jt ju ix jv jw jx jb jy el">Time To First Meaningful Paint</h2><p id="b725" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Time To First Meaningful Paint (TTFMP) measures the time from the start of navigation to the point at which <strong class="gw jz">the most meaningful element appears on the screen</strong>. This is usually the page’s largest image or highest heading. Its appearance indicates to a user that useful information has arrived and that they can start consuming the page’s content.</p><p id="a7b6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">To instrument TTFMP, product engineers tag their page’s meaningful element with an id; we call this element the FMP target. We then recursively search for the page’s FMP target.</p><figure class="hu hv hw hx hy hz"><div class="ka s ic"><figcaption class="kd ke cu cs ct kf kg bo b ev bq br"><em class="kh">A simplified version of our TTFMP polyfill</em></figcaption></div></figure><p id="5c19" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">It’s important to note that this metric requires <em class="ki">manual instrumentation</em> by our product engineers: every page must include an FMP target, or we’ll never record the first meaningful paint milestone. To ensure that each page instruments TTFMP correctly, we report on how often this element is found on a given page. 
If it is found less than 80% of the time due either to lack of instrumentation or to conditional rendering of the FMP target, we trigger alerts to warn that the metric is not valid for that page. This requires developers to keep the TTFMP instrumentation up to date through page redesigns, refactors, and A/B tests.</p><p id="9749" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Instrumenting TTFMP automatically is difficult because it is hard to systematically know what element is the most “meaningful” on the page. <a href="https://web.dev/lcp/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Largest Contentful Paint</a> addresses this by measuring the largest element on the page. We do not use Largest Contentful Paint because the <a href="https://developer.mozilla.org/en-US/docs/Web/API/LargestContentfulPaint" class="bu hs" target="_blank" rel="noopener ugc nofollow">browser API</a> for this metric only returns the paint timing for initial load and is not available for client transitions in our single page app. If Largest Contentful Paint could be reset and used for client-side routed transitions too, we would use Largest Contentful Paint as a default that requires no manual instrumentation.</p><h2 id="b943" class="ji ii do bo ij jj jk jl im jm jn jo ip jp jq jr it js jt ju ix jv jw jx jb jy el">First Input Delay</h2><p id="e9cd" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">First Input Delay (<a href="https://web.dev/fid/" class="bu hs" target="_blank" rel="noopener ugc nofollow">FID</a>) measures the time it takes for the browser to <strong class="gw jz">start responding to user interaction</strong>. A low FID signals to the user that the page is usable and responsive. 
Conversely, anything over 50 ms is a <a href="https://developer.mozilla.org/en-US/docs/Web/Performance/How_long_is_too_long" class="bu hs" target="_blank" rel="noopener ugc nofollow">perceptible delay to a user</a>. To support client transitions, we forked the <a href="https://github.com/GoogleChrome/web-vitals" class="bu hs" target="_blank" rel="noopener ugc nofollow">first-input-delay</a> instrumentation from web-vitals so that the observation of the input delay can be reset.</p><h2 id="51bd" class="ji ii do bo ij jj jk jl im jm jn jo ip jp jq jr it js jt ju ix jv jw jx jb jy el">Total Blocking Time</h2><p id="cc50" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Total Blocking Time (<a href="https://web.dev/tbt/" class="bu hs" target="_blank" rel="noopener ugc nofollow">TBT</a>) measures the total time for which <strong class="gw jz">the main thread is “blocked”</strong>. When TBT is high, the page may freeze or stop responding when scrolling or interacting, and animations may be less smooth. Tasks that take longer than 50 ms are considered “<a href="https://w3c.github.io/longtasks/" class="bu hs" target="_blank" rel="noopener ugc nofollow">long tasks</a>” and contribute to TBT.</p><p id="19e1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">One difficulty with using TBT is that it can be hard to attribute blocking to specific components or sections on our pages. For this reason, we have created a sub-metric we call <em class="ki">interactivity spans</em>, which captures blocking time that occurs within a specified window.</p><p id="9693" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">While we report the <em class="ki">total</em> blocking time, we know that <em class="ki">not all blocking time is equal</em>: time spent blocking user interaction is worse than idle blocking time. 
Another drawback is that blocking time accumulates indefinitely over the lifetime of the page, which makes the metric hard to collect synthetically and sensitive to session length. We’re investigating how to attribute specific blocking times to user interaction, and will follow the direction of the <a href="https://web.dev/smoothness/" class="bu hs" target="_blank" rel="noopener ugc nofollow">animation smoothness metrics</a> in the web vitals initiative.</p><p id="ed89" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">TBT is <a href="https://developer.mozilla.org/en-US/docs/Web/API/Long_Tasks_API#performancelongtasktiming" class="bu hs" target="_blank" rel="noopener ugc nofollow">currently only available in Chromium-based browsers</a>, and there is no polyfill available. In other browsers, we do not report TBT. However, we have found that even with limited browser support, TBT is a useful measurement of post-load performance.</p><h2 id="bbb5" class="ji ii do bo ij jj jk jl im jm jn jo ip jp jq jr it js jt ju ix jv jw jx jb jy el">Cumulative Layout Shift</h2><p id="6d33" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Cumulative Layout Shift (<a href="https://web.dev/cls/" class="bu hs" target="_blank" rel="noopener ugc nofollow">CLS</a>) measures the layout instability that occurs during a page session, weighted by both the size and the distance of the element shift. A low CLS indicates to the user that the page is <strong class="gw jz">predictable</strong> and gives them confidence to continue interacting with it.</p><p id="d214" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">CLS is also <a href="https://developer.mozilla.org/en-US/docs/Web/API/LayoutShift#browser_compatibility" class="bu hs" target="_blank" rel="noopener ugc nofollow">not available</a> in every browser we support. 
Because there is no polyfill available, we do not report any value for CLS in those browsers. Similar to TBT, we find even partial browser coverage to be useful, as a shift in Browser A likely also occurs in Browser B.</p><h1 id="a0d5" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Web Page Performance Score</h1><p id="e7dc" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">We combine these metrics using the Page Performance Score (PPS) system, described in the <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936">previous post in this series</a>. PPS combines input metrics into a 0–100 score that we use for goal setting and regression detection.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><div class="ka s ic ko"><div class="kp kc s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="" class="t u v kl aj" src="https://miro.medium.com/max/1400/0*3bNL6_zkmtI26zDy" width="700" height="126" srcset="https://miro.medium.com/max/552/0*3bNL6_zkmtI26zDy 276w, https://miro.medium.com/max/1104/0*3bNL6_zkmtI26zDy 552w, https://miro.medium.com/max/1280/0*3bNL6_zkmtI26zDy 640w, https://miro.medium.com/max/1400/0*3bNL6_zkmtI26zDy 700w" role="presentation" /></div></div></div></div></div><figcaption class="kd ke cu cs ct kf kg bo b ev bq br"><em class="kh">A diagram of the relative weightings of input metrics to the PPS score for a given page. 
</em><a href="https://web.dev/fcp/" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="kh">TTFCP</em></a><em class="kh">: 35%, </em><a href="https://web.dev/fid/" class="bu hs" target="_blank" rel="noopener ugc nofollow">FID</a><em class="kh">: 30%, TTFMP: 15%, </em><a href="https://web.dev/tbt/" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="kh">TBT</em></a><em class="kh">: 15%, </em><a href="https://web.dev/cls/" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="kh">CLS</em></a><em class="kh">: 5%</em></figcaption></div></figure><h1 id="dd3c" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Web Vitals and Lighthouse</h1><p id="20c2" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el"><a href="https://github.com/GoogleChrome/web-vitals" class="bu hs" target="_blank" rel="noopener ugc nofollow">Web Vitals</a> and <a href="https://developers.google.com/web/tools/lighthouse" class="bu hs" target="_blank" rel="noopener ugc nofollow">Lighthouse</a> are large sources of inspiration and research for our implementation of PPS on the web.</p><p id="7b54" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Lighthouse is a tool that rates a web page by running synthetic tests, auditing, and scoring the page. However, Lighthouse runs these tests synthetically, while PPS scores pages according to real user metrics. Lighthouse is a powerful diagnostic tool, while PPS lets us use real user metrics for goal setting and regression detection.</p><p id="8db1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Web Vitals is a library that measures real user metrics, similar to PPS. However, it does not include a numerical scoring system similar to PPS or Lighthouse, and it does not yet account for client transitions inside a Single Page Application. 
We do build on Web Vitals by including and prioritizing similar metrics, so that PPS and Web Vitals move in the same direction.</p><h1 id="84d3" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Early Flush Case Study</h1><p id="6a99" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">When making changes to improve performance, we often run A/B tests to gather data on how successful our improvements were. Ideally, we would strictly improve performance by improving one or more of the metrics described previously. However, we sometimes see examples where one metric has improved at the expense of another. The PPS system streamlines decision making when we weigh such tradeoffs.</p><p id="3f09" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">As an example, on pages that have dynamic content (such as our listing pages), we previously CDN cached a generic version of the page that contained a loading state, leading to a fast TTFCP. 
We then ran an experiment to flush HTML content from the server early and skip this initial loading state.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct kt"><div class="ka s ic ko"><div class="ku kc s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="" class="t u v kl aj" src="https://miro.medium.com/max/1400/1*bXFjV4d-JKnNmRFa6gK0_Q.png" width="700" height="236" srcset="https://miro.medium.com/max/552/1*bXFjV4d-JKnNmRFa6gK0_Q.png 276w, https://miro.medium.com/max/1104/1*bXFjV4d-JKnNmRFa6gK0_Q.png 552w, https://miro.medium.com/max/1280/1*bXFjV4d-JKnNmRFa6gK0_Q.png 640w, https://miro.medium.com/max/1400/1*bXFjV4d-JKnNmRFa6gK0_Q.png 700w" role="presentation" /></div></div></div></div></div><figcaption class="kd ke cu cs ct kf kg bo b ev bq br"><em class="kh">Left: Before, CDN cached — shimmering skeleton loading state. Right: After, Early flushed page, including the first meaningful paint image.</em></figcaption></div></figure><p id="0261" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The result of this experiment was a slower TTFCP without the CDN, but a faster TTFMP because we skip the initial loading state. Though we weight TTFCP higher than TTFMP, we found that the magnitude of improvement in TTFMP outweighed the regression in TTFCP and shipped the change. This type of decision is simple to make when we have a Web Page Performance Score to help us consistently evaluate tradeoffs.</p><h1 id="77fb" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Summary</h1><p id="bc09" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">We have seen through experimentation that these metrics correlate with positive user experience changes. 
Web PPS gives us a single score we can use for goal setting and regression detection, while also capturing many different aspects of user experience: paint timings, interactivity and layout stability. We hope that Web PPS can be used as a reference for implementing similar systems outside of Airbnb.</p><p id="70da" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Our deepest thanks go out to our industry colleagues working on performance — as the industry evolves Web PPS will also evolve.</p><p id="6075" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Thanks to <a href="https://www.linkedin.com/in/lupinglin/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Luping Lin</a>, Victor Lin, <a href="https://www.linkedin.com/in/gabe-lyons-9a574543/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Gabe Lyons</a>, <a href="https://www.linkedin.com/in/nickbryanmiller/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Nick Miller</a>, <a href="https://www.linkedin.com/in/hdezninirola/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Antonio Niñirola</a>, <a href="https://www.linkedin.com/in/adityapunjani/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Aditya Punjani</a>, <a href="https://www.linkedin.com/in/guy-rittger-%E2%93%A5-1355b4/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Guy Rittger</a>, <a href="https://www.linkedin.com/in/scheuermann/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Andrew Scheuermann</a>, <a href="https://www.linkedin.com/in/jnvollmer/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Jean-Nicolas Vollmer</a>, and <a href="https://www.linkedin.com/in/xiaokangxin/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Xiaokang Xin</a> for their contributions to this article and to PPS.</p><p id="f1af" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp 
hq hr dg el"><em class="ki">All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.</em></p></div></div></div></div></div>]]></description>
      <link>https://medium.com/airbnb-engineering/measuring-web-performance-at-airbnb-122da8d3ea3f</link>
      <guid>https://medium.com/airbnb-engineering/measuring-web-performance-at-airbnb-122da8d3ea3f</guid>
      <pubDate>Mon, 06 Dec 2021 20:05:00 +0100</pubDate>
    </item>
    <item>
      <title><![CDATA[Creating Airbnb’s Page Performance Score]]></title>
      <description><![CDATA[<section class="dg dh di dj dk"><div class="n p"><div class="ab ac ae af ag dl ai aj"><div class=""><div class="em"><div class="n en eo ep eq"><div class="o n"><div><a rel="noopener follow" href="https://medium.com/@a15n?source=post_page-----5f664be0936-----------------------------------"><img alt="Andrew Scheuermann" class="s er es et" src="https://miro.medium.com/fit/c/96/96/1*NI3JLWXkFfZhrpaOO81Kew.png" width="48" height="48" /></a></div><div class="eu aj s"><div class="n"><div><div class="ew n o ex"><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/@a15n?source=post_page-----5f664be0936-----------------------------------">Andrew Scheuermann</a><div class="eu n"><div class="fv s"><div><div><div class="ft" role="tooltip" aria-hidden="false"><div class="s"></div></div></div></div></div></div></div></div><div><a class="bu bv bw bx by bz ca cb ba cc fe cf cg ch" rel="noopener follow" href="https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936?source=post_page-----5f664be0936-----------------------------------">Nov 18</a> · 6 min read</div></div></div><div class="n gi gj gk gl gm gn go gp z"><div class="n o"><div class="gq s ap"><div class="gq s ap"><div class="gq s ap"><div class="s ap"></div></div></div></div></div><p id="f128" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Learn how Airbnb built the Page Performance Score, a 0–100 score that measures multiple performance metrics from real users on any platform.</p><p id="dedc" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a href="https://www.linkedin.com/in/scheuermann/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Andrew Scheuermann</a></p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct ht"><img alt="Two men playing 
guitar, one man playing an oboe." class="aj if ig" src="https://miro.medium.com/max/1400/0*Mipmtj8Fk9XmyBMK" width="700" height="467" /></div></div></figure><p id="c6a6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Performance is important at Airbnb and part of our <a class="bu hs" rel="noopener" href="https://medium.com/airbnb-engineering/commitment-to-craft-e36d5a8efe2a">Commitment to Craft</a>. A fast experience is good for business and critical to our mission to “create a world where anyone can belong anywhere”.</p><p id="75c3" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Before we can create a fast experience we need to agree on what “fast” means. Web, iOS, and Android each have different platform-specific performance metrics. For product engineers it can be challenging to understand which of these metrics to prioritize, and for management it’s difficult to compare platforms and keep progress reports succinct.</p><p id="62de" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">We’ve developed a new performance measurement system called the Page Performance Score that allows us to track multiple performance metrics from real customers across different platforms with ease. This post describes that system, and in the following weeks we’ll be publishing deep dives into the specifics for Web, iOS, and Android.</p><h1 id="6225" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Early Performance Measurement Efforts</h1><p id="0b4b" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">When Airbnb first started measuring performance, we used a single metric called “Time To Airbnb Interactive” (TTAI) that measured the time from page start to when content became visible and interactive. This approach had many positive outcomes. 
We built performance tracking architecture, fixed latency issues, and cultivated a company culture that valued performance.</p><p id="f827" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">However, TTAI also had shortcomings. Different platforms had different baselines and goals. Page comparisons were difficult because the “interactive” definition could change between similar pages. In some situations TTAI improved but engagement metrics did not. Most importantly, TTAI was a single metric and a single metric cannot capture the full spectrum of our customers’ performance expectations. Our definition of “fast” was incomplete and limited our overall performance efforts.</p><blockquote class="ji jj jk"><p id="5db6" class="gu gv jl gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">A single metric cannot capture the full spectrum of our customers’ performance expectations.</p></blockquote><h1 id="2a09" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Introducing the Page Performance Score</h1><p id="7410" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">We needed a nuanced view of performance while maintaining the simplicity of tracking a single number, so we created the Page Performance Score (PPS).</p><ul class=""><li id="3d22" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jm jn jo el"><strong class="gw jp">Page</strong>: The entire customer journey on Airbnb is divided into different pages.</li><li id="43a2" class="gu gv do gw b gx jq gz ha hb jr hd he hf js hh hi hj jt hl hm hn ju hp hq hr jm jn jo el"><strong class="gw jp">Performance</strong>: A page contains multiple performance metrics.</li><li id="87b0" class="gu gv do gw b gx jq gz ha hb jr hd he hf js hh hi hj jt hl hm hn ju hp hq hr jm jn jo el"><strong class="gw jp">Score</strong>: Every day, on each platform, we formulate a 
given page’s performance data into a 0–100 score.</li></ul><p id="8cbe" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">PPS allows us to combine multiple input metrics into an easily comparable score. PPS is a step-function improvement over our prior single-metric approach.</p><h1 id="1b8d" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">The Metrics</h1><p id="1df5" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">The metrics that we measure differ by platform, but the general approach of measuring multiple metrics and formulating a 0–100 score is the same. All of the metrics are <a href="https://web.dev/user-centric-performance-metrics/" class="bu hs" target="_blank" rel="noopener ugc nofollow">user-centric</a> and fall into two general categories:</p><ol class=""><li id="d31e" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr jv jn jo el"><strong class="gw jp">Initial Load Metrics</strong> measure the time from “page start” to content visible.</li><li id="457e" class="gu gv do gw b gx jq gz ha hb jr hd he hf js hh hi hj jt hl hm hn ju hp hq hr jv jn jo el"><strong class="gw jp">Post Load Metrics</strong> measure page responsiveness after the initial load.</li></ol><figure class="hu hv hw hx hy hz cw jx fs jy jz ka kb kc cb kd ke kf kg kh ki paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct jw"><div class="ko s ic kp"><div class="kq kr s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="The Airbnb app opens, shows a loader, then the final meaningful page content." 
class="t u v kl aj" src="https://miro.medium.com/max/1000/0*BdTy3tjZIYtwtc2S" width="500" height="1055" srcset="https://miro.medium.com/max/552/0*BdTy3tjZIYtwtc2S 276w, https://miro.medium.com/max/1000/0*BdTy3tjZIYtwtc2S 500w" /></div></div></div></div></div><figcaption class="kv kw cu cs ct kx ky bo b ev bq br"><em class="kz">The Airbnb homepage displays the loader and then meaningful content.</em></figcaption></div></figure><h2 id="112b" class="la ii do bo ij lb lc ld im le lf lg ip lh li lj it lk ll lm ix ln lo lp jb lq el">Initial Load Metrics</h2><p id="0aa6" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el"><strong class="gw jp">Time To First Contentful Paint</strong> (Web) and <strong class="gw jp">Time To First Layout</strong> (Native) measure the time from “page start” until the first piece of content is visible, which is commonly a loader.</p><p id="197b" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw jp">Time To First Meaningful Paint</strong> (Web) and <strong class="gw jp">Time To Initial Load</strong> (Native) measure the time from “page start” until the meaningful content is displayed.</p><p id="c6ef" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Initial Load Metrics are visualized on the left.</p><h2 id="4a78" class="la ii do bo ij lb lc ld im le lf lg ip lh li lj it lk ll lm ix ln lo lp jb lq el">Post Load Metrics</h2><p id="f6e7" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el"><a href="https://web.dev/fid/" class="bu hs" target="_blank" rel="noopener ugc nofollow"><strong class="gw jp">First Input Delay</strong></a> (Web) measures the delay between user interaction and when the browser begins to respond. 
Delays of 50ms or longer are <a href="https://developer.mozilla.org/en-US/docs/Web/Performance/How_long_is_too_long" class="bu hs" target="_blank" rel="noopener ugc nofollow">perceptible to the user</a>.</p><p id="7c29" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a href="https://web.dev/tbt/" class="bu hs" target="_blank" rel="noopener ugc nofollow"><strong class="gw jp">Total Blocking Time</strong></a> (Web) and <strong class="gw jp">Thread Hangs</strong> (Native) measure thread blocking, which causes the app to lag during layout, animations, and scrolling.</p><p id="11a3" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw jp">Additional Load Time</strong> (Native) measures the average time that additional loaders are displayed within a page, such as during pagination.</p><p id="c9cc" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><strong class="gw jp">Rich Content Load Time</strong> (Native) measures the average time for images and videos to load.</p><p id="8d10" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a href="https://web.dev/cls/" class="bu hs" target="_blank" rel="noopener ugc nofollow"><strong class="gw jp">Cumulative Layout Shift</strong></a> (Web) measures layout instability weighted by the size and distance of the element shift.</p><h1 id="fcbe" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">The Formula</h1><p id="cf25" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">After measuring the metrics we distill that information into a single number using the PPS Formula, which was forked from the <a href="https://web.dev/performance-scoring/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Lighthouse Formula</a>. 
For each metric we identified Good, Moderate, and Poor thresholds based on internal and <a href="https://web.dev/defining-core-web-vitals-thresholds/" class="bu hs" target="_blank" rel="noopener ugc nofollow">industry data</a>. We created a scoring curve by assigning the Good range a score above 0.7, the Poor range below 0.5, and the Moderate range in between.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lr"><div class="ko s ic kp"><div class="ls kr s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="A log normal curve with X values from 0 to 1, and Y values from 0 to 100,000." class="t u v kl aj" src="https://miro.medium.com/max/1400/0*6RlR2VE-YdrQVLqs" width="700" height="543" srcset="https://miro.medium.com/max/552/0*6RlR2VE-YdrQVLqs 276w, https://miro.medium.com/max/1104/0*6RlR2VE-YdrQVLqs 552w, https://miro.medium.com/max/1280/0*6RlR2VE-YdrQVLqs 640w, https://miro.medium.com/max/1400/0*6RlR2VE-YdrQVLqs 700w" /></div></div></div></div></div><figcaption class="kv kw cu cs ct kx ky bo b ev bq br"><em class="kz">A 10,000ms metric value would score ~0.9 in this example curve.</em></figcaption></div></figure><p id="60c6" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Every day, for each metric on a given page, we calculate the <a href="https://en.wikipedia.org/wiki/Truncated_mean" class="bu hs" target="_blank" rel="noopener ugc nofollow">capped average</a> value from millions of real-user page loads. We map that capped average value against the metric’s curve to get a 0–1 score. We combine the metric scores into a composite PPS score by multiplying each metric score by its weight and summing the results. 
We chose the weights by examining our performance-focused A/B tests and ensuring that the weights are maximally aligned with Airbnb’s internal engagement metrics.</p><h2 id="48e7" class="la ii do bo ij lb lc ld im le lf lg ip lh li lj it lk ll lm ix ln lo lp jb lq el">Web Metric Weights</h2><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lt"><div class="ko s ic kp"><div class="lu kr s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="A percentage stacked bar chart with values TTFCP 35%, TTFMP 15%, FID 30%, TBT 15%, and CLS 5%." class="t u v kl aj" src="https://miro.medium.com/max/1400/0*j1F-huoAevfFGkEF" width="700" height="97" srcset="https://miro.medium.com/max/552/0*j1F-huoAevfFGkEF 276w, https://miro.medium.com/max/1104/0*j1F-huoAevfFGkEF 552w, https://miro.medium.com/max/1280/0*j1F-huoAevfFGkEF 640w, https://miro.medium.com/max/1400/0*j1F-huoAevfFGkEF 700w" /></div></div></div></div></div></div></figure><h2 id="c682" class="la ii do bo ij lb lc ld im le lf lg ip lh li lj it lk ll lm ix ln lo lp jb lq el">Native Metric Weights</h2><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lv"><div class="ko s ic kp"><div class="lw kr s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="A percentage stacked bar chart with values TTFL 10%, TTIL 50%, TH 10%, ALT 15%, and RCLT 15%." 
class="t u v kl aj" src="https://miro.medium.com/max/1400/0*jSbbMQYWVEIGUycR" width="700" height="94" srcset="https://miro.medium.com/max/552/0*jSbbMQYWVEIGUycR 276w, https://miro.medium.com/max/1104/0*jSbbMQYWVEIGUycR 552w, https://miro.medium.com/max/1280/0*jSbbMQYWVEIGUycR 640w, https://miro.medium.com/max/1400/0*jSbbMQYWVEIGUycR 700w" /></div></div></div></div></div></div></figure><p id="06ec" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">The resulting PPS formula can be expressed as….</p><p id="a2e7" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="jl">PPS = curve(metric_1) * weight_1 + curve(metric_2) * weight_2 …</em></p><p id="b06d" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">For example, on Web….</p><p id="32a1" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="jl">PPS = curve(TTFCP) * 35% + curve(TTFMP) * 15% + curve(FID) * 30% + curve(TBT) * 15% + curve(CLS) * 5%</em></p><h2 id="b261" class="la ii do bo ij lb lc ld im le lf lg ip lh li lj it lk ll lm ix ln lo lp jb lq el">PPS Evolutions</h2><p id="fb72" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Migrating the company from a single metric to PPS was organizationally challenging. We had to train the company to stop viewing performance as a single seconds-based number, which is a paradigm shift that requires cross functional alignment. 
To help ease the transition, we mapped the old TTAI ranges to the new PPS ranges.</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lx"><div class="ko s ic kp"><div class="ly kr s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="A table with the following values: Good Speed equals TTAI less than 3 seconds and also equals PPS greater than 70; Average Speed equals TTAI 3 to 5 seconds and also equals PPS 50 to 70; Slow Speed equals TTAI above 5 seconds and also equals PPS less than 50." class="t u v kl aj" src="https://miro.medium.com/max/1400/0*eaCWefUEjUyvHvfH" width="700" height="395" srcset="https://miro.medium.com/max/552/0*eaCWefUEjUyvHvfH 276w, https://miro.medium.com/max/1104/0*eaCWefUEjUyvHvfH 552w, https://miro.medium.com/max/1280/0*eaCWefUEjUyvHvfH 640w, https://miro.medium.com/max/1400/0*eaCWefUEjUyvHvfH 700w" /></div></div></div></div></div></div></figure><p id="387c" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Once the company understood PPS, improving on it was comparatively easy. We simply add or replace metrics as our understanding of performance improves and the 0–100 score remains constant. PPS was designed to evolve. For example, in 2019 the Chrome team introduced <a href="https://web.dev/cls/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Cumulative Layout Shift</a>, which was a perfect candidate for Web PPS. It was a user-centric metric, had good browser coverage, and could be measured on direct and client-routed page loads. We instrumented the metric, validated the data, and then incorporated it into the next version of PPS. 
Easy!</p><h1 id="f508" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Weighted Average Score</h1><p id="400b" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">In addition to tracking individual pages’ PPS scores we track the entire organization’s overall performance progress by creating a Weighted Average Score (WAS). Consider these example PPS scores and traffic for three common pages:</p><figure class="hu hv hw hx hy hz cs ct paragraph-image"><div role="button" tabindex="0" class="ia ib ic id aj ie"><div class="cs ct lz"><div class="ko s ic kp"><div class="ma kr s"><div class="kj kk t u v kl aj at km kn"><div class="ftr-noscript"><img alt="" class="t u v kl aj" src="https://miro.medium.com/max/1400/1*eRUXs4sMKnOixViNn08zCA.png" width="700" height="133" srcset="https://miro.medium.com/max/552/1*eRUXs4sMKnOixViNn08zCA.png 276w, https://miro.medium.com/max/1104/1*eRUXs4sMKnOixViNn08zCA.png 552w, https://miro.medium.com/max/1280/1*eRUXs4sMKnOixViNn08zCA.png 640w, https://miro.medium.com/max/1400/1*eRUXs4sMKnOixViNn08zCA.png 700w" role="presentation" /></div></div></div></div></div></div></figure><p id="bd39" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="jl">(73 * 5,000,000 + 84 * 20,000,000 + 75 * 10,000,000) / 35,000,000 = ~80</em></p><p id="43ef" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">If these were the only pages at Airbnb our WAS would be ~80. 
Airbnb has hundreds of pages so a WAS helps us prioritize and proportionally weight the most high-traffic pages.</p><h1 id="2ff5" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Conclusion</h1><p id="576d" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">With PPS our engineers and data scientists now have a multitude of user-centric performance metrics to understand and improve their products. We can clearly compare the performance progress of different pages, different organizations, and even different platforms. PPS allows teams to set simple goals and determine which individual metrics to prioritize. PPS can evolve: metrics can be replaced, weights can change, targets can tighten, and yet the 0–100 score remains constant.</p><p id="9bf3" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el">Changing our definition of “fast” has been well worth the effort. The company has evolved from viewing performance as a single metric to a 0–100 score that represents the rich, complex realities of performance. We have leveled up our performance measurement system and hope that you apply these learnings in your organization as well. 
If this type of work interests you, check out some of our related positions at <a href="https://careers.airbnb.com/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Careers at Airbnb</a>!</p><h1 id="7db3" class="ih ii do bo ij ik il gz im in io hd ip iq ir is it iu iv iw ix iy iz ja jb jc el">Acknowledgments</h1><p id="8ac1" class="gu gv do gw b gx jd gz ha hb je hd he hf jf hh hi hj jg hl hm hn jh hp hq hr dg el">Thank you to everyone who has helped build PPS over the years: <a href="https://www.linkedin.com/in/adityapunjani/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Aditya Punjani</a>, <a href="https://www.linkedin.com/in/alperkokmen/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Alper Kokmen</a>, <a href="https://www.linkedin.com/in/hdezninirola/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Antonio Niñirola</a>, <a href="https://www.linkedin.com/in/ben-weiher-123088122/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Ben Weiher</a>, <a href="https://www.linkedin.com/in/charlesx2013/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Charles Xue</a>, <a href="https://www.linkedin.com/in/egor-pakhomov-35179a3a/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Egor Pakhomov</a>, <a href="https://www.linkedin.com/in/eli-hart-54a4b975/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Eli Hart</a>, <a href="https://www.linkedin.com/in/elliotsachs/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Elliot Sachs</a>, <a href="https://www.linkedin.com/in/gabe-lyons-9a574543/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Gabe Lyons</a>, <a href="https://www.linkedin.com/in/guy-rittger-%E2%93%A5-1355b4/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Guy Rittger</a>, <a href="https://www.linkedin.com/in/jnvollmer/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Jean-Nicolas Vollmer</a>, <a 
href="https://www.linkedin.com/search/results/all/?keywords=joshua%20nelson%20%E2%9C%A8&amp;origin=RICH_QUERY_SUGGESTION&amp;position=0&amp;searchId=959d4aca-c80e-448a-b415-4a732ba7a84d&amp;sid=Rr6" class="bu hs" target="_blank" rel="noopener ugc nofollow">Josh Nelson</a>, <a href="https://www.linkedin.com/in/joshpolsky/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Josh Polsky</a>, <a href="https://www.linkedin.com/in/lupinglin/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Luping Lin</a>, <a href="https://www.linkedin.com/in/markgiangreco/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Mark Giangreco</a>, <a href="https://www.linkedin.com/in/mattschreinerphd/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Matt Schreiner</a>, <a href="https://www.linkedin.com/in/nickbryanmiller/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Nick Miller</a>, <a href="https://www.linkedin.com/in/thenickreynolds/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Nick Reynolds</a>, <a href="https://www.linkedin.com/in/noahsmartin/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Noah Martin</a>, <a href="https://www.linkedin.com/in/xiaokangxin/" class="bu hs" target="_blank" rel="noopener ugc nofollow">Xiaokang Xin</a>, and everyone else who helped along the way.</p></div></div></div></div></div></div></div></section><div class="n p em mb mc kd" role="separator"><section class="dg dh di dj dk"><div class="n p"><div class="ab ac ae af ag dl ai aj"><p id="8283" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><em class="jl">Interested in joining Airbnb? 
Check out these roles:</em></p><p id="9e9e" class="gu gv do gw b gx gy gz ha hb hc hd he hf hg hh hi hj hk hl hm hn ho hp hq hr dg el"><a href="https://grnh.se/feab9b481us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Android Software Engineer, Guest Experience</em></a><em class="jl"><br /></em><a href="https://grnh.se/cbf480fa1us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Senior iOS Software Engineer, Guest Experience</em></a><em class="jl"><br /></em><a href="https://grnh.se/23eb3d8b1us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Senior Android Software Engineer, Guest Experience</em></a><em class="jl"><br /></em><a href="https://grnh.se/4092a3ba1us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Staff iOS Software Engineer, Guest Experience</em></a><em class="jl"><br /></em><a href="https://grnh.se/a6f52fb91us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Staff Android Software Engineer, Guest Experience</em></a><em class="jl"><br /></em><a href="https://grnh.se/0bcae9dd1us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Senior Software Engineer, Guest Experience</em></a><em class="jl"><br /></em><a href="https://grnh.se/7d47f8ce1us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Staff Fullstack Engineer, Guest Experience</em></a><em class="jl"> <br /></em><a href="https://grnh.se/e80a733a1us" class="bu hs" target="_blank" rel="noopener ugc nofollow"><em class="jl">Senior Data Scientist — Analytics Engineering, Guest Experience</em></a></p></div></div></section></div>]]></description>
      <link>https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936</link>
      <guid>https://medium.com/airbnb-engineering/creating-airbnbs-page-performance-score-5f664be0936</guid>
      <pubDate>Thu, 18 Nov 2021 20:57:00 +0100</pubDate>
    </item>
  </channel>
</rss>
