A Three-Pronged View of AI Transformation - Part III

Feb 2, 2021

This is the last of our three-part article series. You can refer to Part I and Part II here.

The Element of Repeatability

Love the Drawing Board

At the outset of our first large-scale revamp of the people-to-meet algorithm, we encountered a massive roadblock. To add some context on the stage we were in: this was a revamp of the core USP of our product. As the piece of the puzzle that brought in the most value, it needed to be handled with extreme sensitivity. One of the hardest things to think through while building the revamp was keeping it from shifting too far from the way the existing algorithm functioned, since that had become a routine way of operating for end-users.

It took months of planning and preparation across cross-functional teams, from product and tech to design and people-science, to get to release and make sure everything was airtight. Nearing the end of all code and product development, we found a fundamental miss in the way we had designed the revamped algorithm: it gave line managers no talking points, and so not enough to act on.

In the bid to increase accuracy twofold, we forgot that the critical element that made our earlier algorithm better was not the deep tech behind it, but the fact that it was extremely interpretable. Have a problem with your manager? Speak to Amber about it, and she would relay it to the relevant HRBP, who would then have a 1:1 conversation about the employee's pain points and know exactly where to start, where to finish, and how to guide the conversation if it went sideways. In these cases, HRBPs even knew what had driven those employees previously and what continued to motivate them the most, which made negotiation even easier. The context was vital, not the tech.

Compare this to what we had now built: a piece of tech that was incredibly accurate in predicting the likelihood of attrition based on a multitude of data points, ranging from cultural nuances in an employee's location to trends in their conversations with Amber, but that lacked talking points to aid an HRBP before they began a conversation with an employee who might be likely to leave. What used to be "Hey, I know you're upset with your manager, and these are three things I could suggest to make your experience better" now had the potential to become "Hey, a particular system told me you're upset and likely to leave. Could you give me more context on why this says so?" Deployed as is, without the explainability that we later added, it would have created a cold-start problem on this front. The ones likely to leave would bluff their way through the HRBP intervention, and the ones that weren't would lose all trust in the systems.
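To make the difference between a bare score and a score with talking points concrete, here is a minimal sketch. It is not the algorithm described above: it assumes a toy linear model, and the feature names, weights, and the `explain` helper are all hypothetical. The idea is only that each prediction is returned together with the factors that pushed the risk up the most, so an HRBP has somewhere to start.

```python
from dataclasses import dataclass
import math

# Hypothetical weights for a toy linear attrition model.
# Names and values are illustrative assumptions, not taken from the article.
WEIGHTS = {
    "manager_sentiment": -1.8,       # negative sentiment about the manager raises risk
    "workload_sentiment": -0.9,      # negative sentiment about workload raises risk
    "months_since_promotion": 0.04,  # a long wait for promotion raises risk
}
BIAS = -1.0


@dataclass
class Explanation:
    risk: float                # predicted probability of attrition
    talking_points: list[str]  # features pushing the risk up, largest first


def explain(features: dict[str, float], top_k: int = 2) -> Explanation:
    # Contribution of each feature to the logit: weight * value.
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    logit = BIAS + sum(contributions.values())
    risk = 1.0 / (1.0 + math.exp(-logit))
    # Keep only the features that increase risk, ranked by contribution.
    drivers = sorted(
        (name for name, c in contributions.items() if c > 0),
        key=lambda name: contributions[name],
        reverse=True,
    )
    return Explanation(risk=risk, talking_points=drivers[:top_k])


employee = {
    "manager_sentiment": -0.8,
    "workload_sentiment": 0.2,
    "months_since_promotion": 30,
}
result = explain(employee)
print(f"risk={result.risk:.2f}, start with: {result.talking_points}")
```

With the made-up numbers above, the conversation would open with the manager relationship and the wait for a promotion, rather than with "a system told me you're likely to leave".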

I've noticed that bold and radical AI initiatives aren't the ones that gain consensus during decision making. Someone will always have a problem with a particular aspect of the initiative, and that can be a good sign that you aren't lowballing. The decision to redesign parts of our tech and add explainability to the algorithm, on the other hand, was close to unanimous. Spending an additional month and missing the deadlines we had set for ourselves had never felt better, because some form of failure is inevitable in many places and, most importantly, in first-time AI initiatives. This is also where a culture of experimentation kicks in and fundamentally increases the risk-taking ability of individuals and the organization at large, allowing for and fostering innovation. Let your team validate facts through the experiments they run, not through firmly held beliefs and biases.

What came out front and center from this experience was that bouncing back from failure was at the heart of our redeployment. In particular, if you're a first-time AI executive about to go to market with your initiative, you may want to give the initiative enough room to be experimental. Many times, especially with problems that haven't been solved by the mass market, you will run into roadblocks with no out-of-the-box, straightforward way around them. You may find yourself back at the drawing board multiple times, frustratingly, even near the end of some cycles. You could very well have been solving the wrong problem, or have had the wrong solution to the right problem. A problem that looked solvable in a particular way at the start may, nearing the end of the cycle, look like it cannot be solved that way at all. Setting this tone with the team early on will help with faster bounce-backs and team morale, because a lot of things can go wrong at the worst of times.

Encourage starting small: keep the broader view in mind, break the larger piece down into smaller subproblems or components, and run parallel experiments. This approach is usually safer, because your failures become small parts that can be repaired quickly rather than the entire initiative. To reiterate the importance of team morale on this front: an AI initiative fails multiple times and requires countless iterations even before it's presented to an internal audience.

Measure Movement, Don’t Target It

One of the worst things you could do with your initiative is to "deploy and forget." A team of brilliant data scientists comes together to build a fascinating product, works on it for months, and releases it. Eventually, the team may feel they're good to go with whatever they put out, and that it is sufficient to leave in the same state for the longest time, but that could be incredibly detrimental.

Once your product strategy and analytics flow into the start of this initiative, there are two distinct aspects to measuring progress the right way and knowing when it could be time for a change. The first is during your initial experimentation phase. When you're about to roll out a new initiative, you are bound to have some numbers in mind and a picture of what success looks like to you. Try to have dynamic metrics that reflect progress or movement frequently; metrics that only show weekly or monthly changes may slow you down during your experimentation phases. Seeing early trends in your first experiment keeps motivation high and gives you an on-ground reality check.

The second is to measure progress as the environment changes (or as time elapses). We saw a lexicon-based sentiment analysis model dip substantially in effectiveness when it was exposed to a different set of individuals with a different understanding of slang and local language, which prompted us to start building a new lexicon on top of the older one. Similarly, since AI models (at the time of writing) are commonly trained on a set of older data and then exposed to similar kinds of data, there may be newer instances that the model isn't trained well enough for, and where it could falter. This is where your metric will alert you and keep you aware of the latest nuances that your initiative may be encountering.
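As a rough illustration of a metric that alerts you when the environment shifts, the sketch below compares each new week's accuracy against a baseline taken from the first few weeks after deployment. The function name, the four-week baseline, the 0.05 tolerance, and the accuracy figures are all assumptions made for this example, not numbers from our own models.

```python
from statistics import mean


def drift_alerts(weekly_accuracy, baseline_weeks=4, tolerance=0.05):
    """Return the indices of weeks whose accuracy falls more than
    `tolerance` below the average of the first `baseline_weeks` weeks."""
    baseline = mean(weekly_accuracy[:baseline_weeks])
    return [
        week
        for week, accuracy in enumerate(weekly_accuracy)
        if week >= baseline_weeks and baseline - accuracy > tolerance
    ]


# Made-up weekly accuracy of a sentiment model; from week 6 onwards it is
# scoring text from a new group of users with different slang.
history = [0.86, 0.84, 0.85, 0.85, 0.84, 0.83, 0.76, 0.74]
alerts = drift_alerts(history)
print(alerts)
```

Here the metric is used purely as a signal that something has changed and the model (or its lexicon) needs attention; it is not a number for the team to chase.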

As Marilyn Strathern paraphrased Charles Goodhart: “When a measure becomes a target, it ceases to be a good measure.” Alongside tracking your metrics, it is imperative that your measure of success be used as a measure, and not as a target. As soon as you turn a metric into a goal, its usefulness decreases.

This effect is demonstrated by an old and possibly fictional Soviet-era nail factory example.

Once upon a time, there was a factory in the Soviet Union that made nails. Unfortunately, Moscow set quotas on their nail production, and they began working to meet the quotas as described, rather than doing anything useful. When they set quotas by quantity, they churned out hundreds of thousands of tiny, useless nails. When Moscow realized this was not useful and set a quota by weight instead, they started building big, heavy railroad spike-type nails that weighed a pound each.

In addition to understanding what to track, it is useful to know what is outside the scope of your product. If you're building a food-ordering system, it could be insightful to track repeat orders to help you understand the efficacy of your system. However, looking in your product's data for explanations of external stimuli that your product doesn't influence is both incorrect and unfounded. If a large chunk of individuals suddenly stop ordering food from your system during the day, possibly due to a religious festival, there's no need to start a company-wide initiative to restore daytime orders on the assumption that your system is at fault. You can't easily track or fix what you don't directly influence.

The Analytics Network Side Effects

A network effect is when the value of a particular product or service increases as more users start to use it. In a similar bandwagon fashion, as I described earlier, most people who don't already value intelligent analytics will begin to do so once they see it in action, helping teams exponentially. To be comfortably prepared for when that time comes, and it does come if you've covered the other steps well, you need to make sure your micro-teams are empowered enough not to be bottlenecked.

An interesting example of this comes from an early-stage start-up that I was advising. The organization started off with a small data science team of two individuals supporting data requests across the board and working on analyses in parallel. Of the five distinct functions in the company at the time (Customer Support, Sales, Product, Marketing, and Engineering), only sales had a leader who was incredibly data-driven. From sales funnels to lead-conversion propensity scores, he had done and seen it all, and was using the resource to its fullest by getting all the data he could and using it to drive both strategy and execution. From everyone else (mostly those who didn't understand why he spent extra time on seemingly irrelevant aspects of implementation when the ABC of sales was still Always Be Closing), he drew flak for being slow and unnecessarily thoughtful in execution.

As time passed, his data-backed strategic and execution decisions turned out to be right in many respects. This first led the marketing function to explore what they could do with their data to pass the right leads on to sales, and slowly created an aura around the sales leader's data strategy. Suddenly, everyone wanted a piece of the pie, and using "data" to drive decisions became watercooler talk. Multiple functions started speaking to the sales leader to understand how to leverage the data they were generating and make intelligent decisions.

All of this led to two outcomes: bottlenecked data teams and unintelligent mass hoarding of the resource. Both outcomes are inevitable if your organization functions as this one did. Even if it doesn't, there's always room to be leaner by preparing for these two outcomes beforehand, since they may hit you out of the blue someday.

The first one, the creation of bottlenecks, arises primarily when the organization makes a rapid shift to try to be data-driven. This can act like a switch that pulls on most of the resource the moment it is flicked. As the previous example demonstrated, the small data science team of two now carries a much heavier load (5x what they handled before) to deliver analyses and gather insights that power each team with domain-specific intelligence. Even if the team finds a way to keep going, the last thing you want from your data science teams during this phase is attrition. And even if attrition is not one of your problems, your leaders face lags in decision making and a heavy data dependency on someone who isn't part of their micro-org and who may not understand their domain well. This could result in slower and less contextual reports coming through to the team. One way to avoid this problem is to control the flow of the "switch". As part of senior management, you could encourage all teams to limit the requests that come through to the data science teams, and assign weekly quotas so that each team aligns internally on its priorities before passing them on to the data teams. This is good practice regardless of what you're trying to accomplish in the organization, because it helps functional teams think through their requests and examine their priorities closely before they join the queue.
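The weekly quota can be as simple as a spreadsheet, but a small sketch shows the mechanics. Everything here is hypothetical: the `triage` function, the quota of two requests per team, and the sample requests are assumptions for illustration only. Each team ranks its own requests, and only its top few reach the data team in a given week.

```python
from collections import defaultdict


def triage(requests, weekly_quota=2):
    """Split requests into those accepted this week and those deferred.

    `requests` is a list of (team, priority, title) tuples, where a lower
    priority number means more important. Each team keeps only its
    `weekly_quota` most important requests.
    """
    by_team = defaultdict(list)
    for team, priority, title in requests:
        by_team[team].append((priority, title))

    accepted, deferred = [], []
    for team, items in by_team.items():
        items.sort()  # most important first
        accepted += [(team, title) for _, title in items[:weekly_quota]]
        deferred += [(team, title) for _, title in items[weekly_quota:]]
    return accepted, deferred


requests = [
    ("sales", 1, "lead-conversion scores"),
    ("sales", 3, "regional funnel split"),
    ("sales", 2, "churned-deal analysis"),
    ("marketing", 1, "campaign attribution"),
]
accepted, deferred = triage(requests)
print("accepted:", accepted)
print("deferred:", deferred)
```

The point of the design is that prioritization happens inside each functional team, which knows its own domain, before anything lands on the data scientists.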

To work towards better execution strategies and set up for scale, many organizations follow data strategies contextual to them. Some of the best ones I've observed passing through the scale barrier have distinct silos of execution and one central governance team that dictates best practices and is responsible for data strategy and sanity. This particular model may not apply to all contexts, and it does create information silos, although these can be avoided by maintaining robust documentation and intra-team discussions. As an example, the five members of a data team may all report to the same individual but work individually with the product, sales, marketing, and other teams. This structure speeds up execution by keeping context switching to a minimum, and lets the domain experts for a particular problem work without external interruptions. You lose time when a marketing data scientist starts to work with finance: different data, viewpoints, and analytics take time to grasp. Additionally, your resource bottlenecks are highlighted for you to manage accordingly. When your sales data scientist has less bandwidth, you can work around it by reshuffling your upcoming initiatives and lining up the ones that matter first. On the other hand, when only the central team executes, pushing your work up the queue gets difficult, because you have to explain to other stakeholders why your work is more important than theirs.

Unintelligent mass hoarding of data science resources is a peculiar problem of over-engineering situations. It is best described as "herd mentality": when one team shows compounding value from data science efforts, all the others rush in to ask the specialized teams to "figure something out with this data". More often than not, aimless hypotheses are put forth without guidance on how the end-users or teams that require this data plan to use it. This takes us back to the start of the loop, where data science teams are considered "magicians" and everyone else assumes that they'll figure something out and tell us what to do. In a centralized team structure, this also hurts the chances of other teams getting time from the specialized resource for their otherwise well-structured and thought-out plans.

AI as a Journey

When we created our first sentiment-analysis model in-house, we never knew how far we were planning to go with it, or where the end was for us in this particular aspect of the product. We had a fair idea, but nothing concrete. Was it going to be the final deployment in this space? Had we built an initiative that could withstand the test of time, and would require no further maintenance or product addition? A lot of these questions came to mind when we completed our work on that front, and just when we thought that it was good to go, we found ourselves circling back to add enhancements. What was a quarter's work for two data scientists quickly stretched to over half a year as the scope of the work kept increasing (for the better). A sizable portion of the initiatives I've worked on initially seemed like one-time deployments that would never require revisiting or maintenance. Still, they ended up needing both, either to further the success of the work or to fix something broken.

In this ever-changing environment of artificially intelligent algorithms and fast-moving advancements, I practice a methodology that allows initiatives to adapt to this dynamism and have a life beyond their current scope: the flowing document. To add some background, this is the semi-technical document that most technical folks write and run through before they start to build. It lists all the probable outcomes or avenues that the initiative can technically cover, and outlines the product vision as it passes through a technical feasibility barrier. Usually, these documents are written point-in-time. If there's a feature you're looking to build that moves a particular north star, you've (probably) accurately outlined what you want it to do and how it fits in with and impacts the ancillary features. As many of you may have already experienced, a tightly knit document with all use-cases considered is generally held to be of a high standard, whereas one with even a small grey area is frowned upon. Yet in many black-boxed features and initiatives, what works well is to keep some grey area at the end-user level. If I'm creating a text classifier to break down user comments into distinct categories of complaints, building the end-user layer to view segregated comments is excellent. However, maintaining that grey area in the scope of your initiative allows you to create multiple other enhancements for the user and doesn't stop you from deep diving into a feature that you thought was "built for good". Most importantly, it opens doors to revisiting pseudo-closed avenues and allows room for improvement across the board.

Another benefit of this approach is what I call the mugshot problem-solving technique. In every large picture that someone paints of an initiative, you will find mugshots worth solving for quickly. Imagine a wide-angle portrait of a landscape, for instance: a plethora of micro-elements come together to form a holistic representation of the scene. Even though these elements come together to form something more prominent, they have independent identities and journeys of their own as they progress within their silos. Similarly, every large-scale vision has pieces of the puzzle that need to be solved first. These smaller elements, naturally executed faster and with higher focus, let you make incremental progress towards your goal, which helps you maintain the right direction and fail fast or early if you have to. This even gives you more space in your journey to course-correct if your end-goal was misplaced, allowing for rapid experimentation in the process.

If you've set out to build an initiative that helps you predict attrition, chalk out all the essential pieces that you may need. For starters, you need the right data flowing into your systems from HRMS or activity-tracking systems. Take a few steps further, and you'll see you need integrations or other technology to establish the link between your system and those sources. Once you have integrations and your data in place, you may require privacy controls and exportability tools to help your teams analyze the data and bring out insights that drive product strategy. Eventually, you'll address the problem at hand and try to solve it from both perspectives: the end-user’s and your own. To wrap up, you'll create a reporting layer that keeps track of your initiative's health and advises you on your next steps, while keeping your back covered. As you read through this workflow, you'll find that each of these silos or mugshots can have a vivid life cycle of its own. An integration that you build to support your data infrastructure for predicting attrition could be reused for a similar task, such as bucketing employee feedback and drawing demographic comparisons. Similarly, your reporting layer for a particular initiative can act as the backbone for many more to come. It's beneficial to keep the scope of these mugshots free-flowing too, so that you reuse a lot of your innovation.

This is not restricted to one domain: thinking of any transformative initiative as a long-term journey rather than a silver-bullet solution will help you achieve sustainable results over time. Keep making smaller, focused deployments, fail quantitatively, iterate, and move up the ladder.

Written By - Chameli Kuduva, Founder - SaaS Insider