An Empty Box, Costume Parties, and DVDs
This is a continuation of the story about the Chikai 🤖 experiment. The first couple of weeks were covered in a previous post, and these are the insights and observations since then.
When anybody these days comes to a webpage with an empty box and a blinking cursor, it’s obvious what they should do. Google has been setting that expectation for more than two decades and has been doing a pretty fantastic job of making that box magical and setting the bar very high.
So when people first interact with Chikai 🤖, they often treat it as if it were a Google search box. They ask questions like: What’s the weather going to be like tomorrow? What time is my train going to arrive at my destination? How fast can a swallow fly? This is not surprising given that the entire interaction with Chikai 🤖 is through text messaging, where the core interface is effectively the same thing as the empty search box on Google.
It’s an interesting observation, one that many others have made, in that this could be the window through which the next generation of technology companies could usurp the throne that Google now firmly holds. It’s too early to make this call, but it’s hard not to ponder what this could mean if the current trends take hold and are not just a passing fad.
But if the story of Chikai 🤖 ended there it would not be interesting. What is interesting is what happens after that initial exchange, when Chikai 🤖 reaches out to them for the first time, either to ask a question or to just be friendly. When they get that first text message from Chikai 🤖, the dynamic and expectations of the user seem to change almost instantly. They start to understand that this is not like Google because Chikai 🤖 is talking back to them and not just waiting for their next search query. To further separate itself from Google, Chikai 🤖 has a distinct personality and a sense of humor, keeping things light and fun, as opposed to dry and transactional. Chikai 🤖 uses emoji and animated GIFs, and makes cultural references. By the time the user laughs for the first time while using Chikai 🤖, which actually has happened with about a third of users, it has become obvious that this is something different.
Now the next insight from the Chikai 🤖 experiment may seem like a silly thing, but framing the interaction around the Chikai 🤖 construct turns out to be an important part of the experience. I recently went to a fund-raising event for my son’s school and it had a “space odyssey” theme, so my wife and I dressed up in costume. It is always more fun to dress in costume than to just wear normal clothes. Random people will talk to you because it’s an easy way to start a conversation, and if others are also dressed in costume it becomes even more entertaining.
This is how I feel about the Chikai 🤖 construct. It’s like being in costume while talking to these new users for the first time. It is what makes the interaction light and fun, and it is often what gets them to laugh and text back “LOL” or “Hahaha!”. It may seem trivial or without purpose, but it is what gets people to open up and talk about themselves. It helps Chikai 🤖 to get to know them, so that Chikai 🤖 can engage with them in a more personalized way.
Lastly, the more I learn and experiment with Chikai 🤖, the more I’m realizing that what I’m building is not really a chat bot, but more of a conversational UI that could eventually have a chat bot or AI component to it. The bot aspects of Chikai 🤖 that I’m currently implementing are more assistive to humans than they are a replacement for humans. It’s a subtle point and it may seem like a bit of semantics, but it’s the distinction between the interface (i.e. conversational UI) and the backend that is behind it (i.e. chat bots, artificial intelligence). As I look forward to what will happen next with Chikai 🤖, I think the focus for the near term will continue to be on the conversational UI aspects rather than a chat bot. One of the key insights that keeps being reaffirmed for me is that NLP is an extremely hard problem, and if the humor that is currently part of Chikai 🤖 persists, it will be that much harder. All that being said, I do think that the NLP problem will eventually be solved, but the time horizon is at least a decade out.
All of this reminds me of Netflix and how they started off with mailing DVDs. They could not implement their original vision because the technology was not there yet. Broadband was not fast enough and it was not deployed widely enough to make it a viable business when they started. It took ten years from the founding of the company until they launched their video streaming service in 2007. It was a brilliant play that took a deep resolve to bet on the long term and survive long enough to eventually see it to fruition. Because they had built up the demand for watching movies and TV shows, it was straightforward enough to transition the supply from DVDs to streaming video. I think that Uber and Lyft have a similar play with self-driving cars. They both have the demand for rides; they just need to swap out the supply from human-driven cars to self-driving cars. And just like Netflix, they don’t need to do it all at once. They can slowly transition from one to the other over time. Uber was founded in 2009 and it does seem plausible given how things are developing that we will have self-driving cars by 2019. I am willing to bet that Uber will have a fleet of self-driving cars out there by that same time or soon after.
I feel that there is a similar situation with NLP and artificial intelligence. I do not think it is ready today, but ten years from now it will likely be ready. And like broadband and self-driving cars, it does not necessarily have to be developed in-house. I think deep learning, NLP, voice recognition and those types of technologies will eventually be available in the cloud like any other service on AWS today. It will be a commodity that any company can access and apply to their products. What will not be a commodity is the large-scale, unique data sets that will train those deep learning systems. So the play for chat bots and conversational UI will be to own the demand, even though it may be mostly humans on the backend in the beginning. This will then generate an incredibly large and unique data set over time that can be used with the NLP and AI systems of the future.
But it is so early in the game that I don’t think we know exactly what people want out of a chat bot. We thought people wanted portals when the internet boom started, but in the end it was search engines. We thought that people wanted physical keyboards on smartphones, but virtual keyboards and multi-touch are what won in the end. We think that people want virtual assistants with chat bots and conversational UI, but my guess is that it will be something different in the end. That is where Chikai 🤖 comes in.
Chikai 🤖 is a tool for discovery to help me figure out what it is that people really want, not by what they tell me, but by how they interact and what they actually do. I have learned so much in the month that I’ve been working on Chikai 🤖, but I’m still at the very beginning of the journey. If I’m right, it will be a very long road, probably at least a decade before we see the full extent of where this will all go. It will be the marathon runner, the cockroach that will win in the end, not the sprinter or the unicorn.