What’s the Good Word? – Say Hello to Text Analytics

From its start in 2004, Enova has always been a pioneer in analytics in the FinTech space. We have a 60+ person analytics team that works with a 200+ person software engineering team, and analytics forms the backbone of our business. We use analytical models at every touchpoint of our customers’ journey, from product design and marketing to fraud checks, decisioning, operations, and customer support, and we use advanced algorithms, including graph algorithms. Two years ago, we started our analytics-as-a-service offering, Enova Decisions.

But one area we hadn’t explored until recently is text analytics. Text analytics itself is not new; several tech companies already use it successfully. For example, Uber uses text analytics to propose automatic responses to customer messages in the Uber app so that the driver can respond quickly without getting too distracted from the road.

However, many mid-sized companies are hesitant to get into text analytics because, as with any new technology, there are inherent costs to investing in it: it can be resource intensive and requires some upfront investment. If you have been sitting on the fence about implementing text analytics at your company, this article is for you!

At Enova, we believe the time is ripe to take the bold step of exploring and implementing text analytics. First, success stories abound, so you would not be the first to do it. Second, the packages and technologies are readily available: Python’s NLTK package is a great example, and R also has packages for text analysis. These packages are easy to include in your model-building workflow, so you can get a head start quickly.
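As one example, here is a minimal sketch (the sample message is made up) of what getting started with NLTK can look like: tokenize a message, drop English stopwords, and look at the most frequent terms.

```python
# A minimal sketch of getting started with NLTK: tokenize a message,
# drop English stopwords, and count the most frequent terms.
import nltk
from nltk.corpus import stopwords

nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)  # tokenizer data used by newer NLTK releases
nltk.download("stopwords", quiet=True)

text = "Thanks for the proposal. Yes, the new payment plan works for me."
tokens = [t.lower() for t in nltk.word_tokenize(text) if t.isalpha()]
tokens = [t for t in tokens if t not in stopwords.words("english")]

print(nltk.FreqDist(tokens).most_common(5))
```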

So how do you get your feet wet? Enova is currently taking this step, and I wanted to share some of our initial insights and learnings.

How to Get Started

1. Understand the capabilities

Start by understanding the capabilities of text analytics and talking to your stakeholders. Text analytics approaches range from basic rule-based keyword searches to sentiment analysis to full-blown natural language processing. There are a number of good books and articles on this topic; for example, Text Analytics With Python is a good starting point.
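To make the simplest end of that spectrum concrete, here is a small illustrative sketch of a rule-based keyword search; the keywords and the routing scenario are placeholders, not a recommendation.

```python
# An illustrative rule-based keyword search: flag messages that mention a
# refund so they can be routed to the right queue. Keywords are placeholders.
REFUND_KEYWORDS = ("refund", "money back", "reimburse")

def mentions_refund(message: str) -> bool:
    """Return True if the message contains any refund-related keyword."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in REFUND_KEYWORDS)

print(mentions_refund("I would like my money back, please."))  # True
print(mentions_refund("When is my next payment due?"))         # False
```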

2. Understand the use cases

Text analytics can be used for some pretty cool use cases; for example, see the SAS blog article on five remarkable use cases. However, none of these matter if you are not actually helping your customer or moving the needle for your business. As with any initiative, this is best done by talking with your customers and your business stakeholders. Not only will they tell you what their pain points are; including them early in the discussion will also win their support during implementation and change management.

3. Build a Proof-of-Concept

Plan to spend some focused, dedicated time on a proof-of-concept. At Enova, there is a Fellowship Program where people are encouraged to submit innovative ideas. If selected, the team gets to work on the idea for four weeks without having to worry about their day jobs. This is where my coworker Madhuri Gupta and I pitched an application of text analytics.

For one of our leading brands, in certain situations, we propose repayment plans to some of our customers via email. If the customer accepts the proposal, we update the customer’s case and schedule the new payments. In the current system, the customer’s response email goes to a call center representative’s queue to be processed. We wanted to see whether we could reliably use text analytics to parse these emails and determine automatically whether the customer had accepted the proposed plan.

We were able to show the business that there is definite value in applying text analytics. We compared a number of models and approaches. The base case, a limited dictionary-based lookup, did not provide much lift. Next, we used the vader Python package, which assigns sentiment scores; this gave a slight improvement. A custom xgboost model gave the best results: we were able to correctly identify about 80% of the incoming emails.
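The sketch below is a simplified, illustrative version of that comparison, not our production code: a vader sentiment-score baseline next to an xgboost classifier trained on TF-IDF features. The toy emails, labels, and parameter choices are stand-ins for the real dataset and tuning.

```python
# Illustrative comparison: vader sentiment baseline vs. xgboost on TF-IDF.
# The toy emails and labels below stand in for the labeled dataset (1 = accepted).
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

emails = [
    "Yes, that plan works for me, thank you.",
    "Sounds good, please schedule the payments.",
    "I accept the proposal you sent over.",
    "Great, let's go ahead with the new plan.",
    "No, I cannot afford that amount right now.",
    "Please do not schedule anything yet.",
    "That won't work for me, can we discuss other options?",
    "I am not able to accept this proposal.",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    emails, labels, test_size=0.25, stratify=labels, random_state=42)

# Baseline: call it "accepted" when the vader compound score is positive.
analyzer = SentimentIntensityAnalyzer()
vader_preds = [int(analyzer.polarity_scores(t)["compound"] > 0) for t in X_test]
print("vader baseline accuracy:", accuracy_score(y_test, vader_preds))

# Custom model: TF-IDF features fed into an xgboost classifier.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
model = XGBClassifier(n_estimators=200, max_depth=4)
model.fit(vectorizer.fit_transform(X_train), y_train)
xgb_preds = model.predict(vectorizer.transform(X_test))
print("xgboost accuracy:", accuracy_score(y_test, xgb_preds))
```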

Challenges you might face

In addition to giving the business confidence in the value of custom text analytics models, the proof-of-concept also brought to light areas where up-front investment was needed to implement text analytics. As with any model-building exercise, a big challenge was the data pull and data transformation, but this was expected. However, a lot of time and effort went into the initial labeling of the emails. In our current workflow, once a representative reads and understands the customer’s email, they simply take the action: schedule the new payment or send a response with an updated proposal. While we can observe these actions, the meaning the representative took from the email is never persisted anywhere.

This meant that we had to come up with a way to label these emails. Given the short time frame, manually labeling the training and test sets was not feasible, so we decided to investigate proxies. We had two possible candidates. First, if a payment was scheduled soon after the customer’s email, it is reasonable to assume that the customer accepted the proposal. Second, the presence of certain keywords in the notes that the rep enters on the customer’s case can also indicate acceptance. To evaluate the effectiveness of these proxy labels, we manually labeled a subset (about 200 emails) and compared the manual labels against each proxy. The first candidate, whether a payment was scheduled, turned out to be the better proxy, and we used it to label the entire dataset.
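The comparison itself is straightforward; below is an illustrative sketch with hypothetical field names (the two-day payment window and the note keywords are placeholders) showing how each proxy can be scored against the hand-labeled sample.

```python
# Illustrative sketch of scoring each candidate proxy against the hand-labeled
# emails; the field names and the two-day window are hypothetical placeholders.
from sklearn.metrics import accuracy_score

def payment_proxy(record):
    # Proxy 1: a payment was scheduled shortly after the customer's email.
    return int(record["payment_scheduled_within_2_days"])

def notes_proxy(record):
    # Proxy 2: the rep's case notes contain an acceptance-related keyword.
    notes = record["rep_notes"].lower()
    return int(any(k in notes for k in ("accepted", "agreed to the plan")))

def agreement(proxy_fn, records, manual_labels):
    """Fraction of records where the proxy label matches the manual label."""
    return accuracy_score(manual_labels, [proxy_fn(r) for r in records])

# Tiny stand-in for the ~200 manually labeled emails.
labeled_sample = [
    {"payment_scheduled_within_2_days": True, "rep_notes": "Customer accepted the plan."},
    {"payment_scheduled_within_2_days": False, "rep_notes": "Left a voicemail, no answer."},
]
manual_labels = [1, 0]

print("payment proxy agreement:", agreement(payment_proxy, labeled_sample, manual_labels))
print("notes proxy agreement:", agreement(notes_proxy, labeled_sample, manual_labels))
```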

Data cleansing and labeling is an important area where some up-front investment is needed when implementing text analytics. In the initial phases, this can be done using in-house resources (e.g. customer service representatives) or external resources like Amazon Mechanical Turk. (Of course, one needs to consider the sensitivity of the data when using an external service.) There are also third-party tools available to help with automatic labeling of datasets. However, you need to take into consideration their actual capabilities and the quality of the labeling for your business.

For a longer-term program, you may want to consider investing in systems where text data is cleaned, tagged, and made available for building analytical models. You can also redesign your business processes and applications so that text analytics is not an afterthought. For example, are you persisting all the details from a transaction that are needed for text analytics?

One thing to watch out for

When building customer-facing text analytics applications like chatbots, it is important not to pretend to be human. How many times do people try to reach an operator or a human assistant on an IVR menu? The goal should be to help the customer as quickly as possible. This means being upfront with the customer and letting them know that they are interacting with an automated system. It also means holding yourself to a higher customer support standard. For example, can you answer the customer’s question in less than two attempts?

I am excited about the adventure we are on at Enova in realizing the full potential of text analytics for our business, and I encourage you to take that first step too!