In this article, we're going to discuss some amazing things you can do with ChatGPT Plus and OpenAI's Code Interpreter add-on. But first, we need to discuss the giant purple elephant that's about to blink into the room.
What is that giant purple elephant, you ask? Data security. Specifically, we need to discuss your (and, in this case, my) proprietary data. Here's the thing. For ChatGPT Plus to be able to mine your data, it has to have access to it.
See where I'm going here? To do everything I'm about to tell you about, I had to upload a 22,797-record data set exported from my company's servers. What will OpenAI and ChatGPT do with that data? I have no idea. That's a big risk.
In my case, it's more important to share the process of data analysis with you than to safeguard my data. But that's my decision to make. It's my data. I know that I won't be violating any disclosure agreements or putting my company at risk by sharing it with ChatGPT (and, by extension due to this article, with you).
But if you use these techniques -- and make no mistake, they are gobsmackingly powerful -- you'll need to decide whether you and your company can comfortably share that data with an AI and, possibly, the rest of the internet.
There is a possible way to disinvite the elephant. OpenAI has introduced a new tier of purchasable ChatGPT service: ChatGPT Enterprise. This service tier solves many of the concerns I listed above. Specifically, "Customer prompts and company data are not used for training OpenAI models." It also encrypts data both in transit and at rest.
This would allow you to more safely upload data like the example I'll be showing, without concern that your proprietary data will get loose in the wild. The catch? Pricing data hasn't been disclosed. OpenAI is using the dreaded "a salesperson will call" as a substitute for a published price. Most likely, any ChatGPT Enterprise service will be priced out of the range of smaller companies. That said, OpenAI has also promised, "Availability for all team sizes: a self-serve ChatGPT Business offering for smaller teams."
So, there's that. No details on when that will happen, or its price, but the company does say, "We'll launch them as soon as they're ready."
And with that, let me show you why this is exciting.
What are we looking at?
The data set I'm using is uninstall data, gathered when users uninstall my WordPress plugins. Here's how that works.
When a user chooses to uninstall either Seamless Donations or My Private Site, they're presented with a dialog that asks for an uninstall reason. Data from each of those uninstalls is sent to my server, where it's stored.
Up until now, I've been able to see the data represented in tabular form, like this:
But that's about as good as it got. I never had the time to build any detailed analytics to chart or create pivot tables. So I could thumb back a few pages and get a rough feel for what was happening with recent uninstalls, but I had no thousand-foot view with which to derive overall insights.
Preparing ChatGPT for your file upload
You'll need ChatGPT Plus, which is the version of ChatGPT available via a $20/month subscription.
You'll also need to go to your ChatGPT settings, and switch on Code Interpreter from the Beta Features tab:
And, finally, when you begin a session, you'll need to select GPT-4 and Code Interpreter. If you do all that, you're set.
The next thing you'll need to do is upload your data. By this point, I'm assuming you and your management team have thought through the giant purple elephant implications (okay, now I'm just doing it for the lulz), and you're okay with uploading data to Skynet. If so, here goes.
Click the plus sign at the bottom of your session screen:
Click Upload to upload your file. When you're done, hit return.
Once that was done, ChatGPT showed me how many records were in the file. To be sure it was able to read what I uploaded, I asked it to describe the fields.
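For readers curious what that sanity check looks like outside of a chat window, here is a minimal sketch using only Python's standard library. It counts the records and reports how many rows have a value in each field. The sample rows and the column names (product, version, date, reason, comment) are assumptions made for illustration; the real export's schema isn't shown in this article, and this is not the code ChatGPT ran.

```python
import csv
import io

# Hypothetical sample standing in for the exported file. The column names
# (product, version, date, reason, comment) are assumptions for illustration.
SAMPLE_CSV = """product,version,date,reason,comment
Seamless Donations,5.1.9,2021-03-14,no-longer-needed,
My Private Site,3.0.4,2022-07-02,other,Only needed it for a class project
Seamless Donations,5.2.1,2023-04-20,found-better-plugin,
"""


def describe_fields(csv_text):
    """Return (record_count, {field_name: number_of_non_empty_values})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    filled = {name: 0 for name in reader.fieldnames}
    count = 0
    for row in reader:
        count += 1
        for name, value in row.items():
            # Count a field as filled only if it holds non-whitespace text.
            if value and value.strip():
                filled[name] += 1
    return count, filled


record_count, filled_counts = describe_fields(SAMPLE_CSV)
print(f"{record_count} records")
for field, n in filled_counts.items():
    print(f"  {field}: {n} non-empty values")
```

To try it on a real file, you would read the file's text and pass it to `describe_fields` in place of `SAMPLE_CSV`.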
Let's make data analytics magic together
When using Code Interpreter, ChatGPT is…chatty. It's like that enthusiastic geek friend who can't get to the point and has to share everything about how they got to the answer, before giving you an answer -- or like that article writer who takes a few thousand words to give you essential backstory before finally getting to the few key "how to" instructions.
Because ChatGPT is so chatty, I'm going to show you screenshots of its answers. I'm going to cut out all the extended information provided before and after the answers. Otherwise, these screenshots would be a mile long.
And with that, I asked a simple question and got a clear answer.
How many records are there for each product?
To be fair, creating that calculation wouldn't be hard to code, but it would be time-consuming. ChatGPT? 15 seconds, on the fly. Boom.
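As a point of comparison, a hand-coded version of that per-product count might look something like the sketch below. The `product` field name and the three sample rows are assumptions for illustration, not the actual data, and there's no claim that this matches what Code Interpreter generated behind the scenes.

```python
from collections import Counter

# Hypothetical rows; "product" is an assumed field name.
rows = [
    {"product": "Seamless Donations"},
    {"product": "My Private Site"},
    {"product": "Seamless Donations"},
]


def records_per_product(rows):
    """Tally how many records mention each product."""
    return Counter(row["product"] for row in rows)


counts = records_per_product(rows)
for product, n in counts.most_common():
    print(f"{product}: {n}")
```

The counting itself is only a few lines. Most of the real effort would go into loading, cleaning, and checking the export first.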
What percentage of records contain comments?
Most users don't leave comments, and those who do tend to be the ones who selected "Other" rather than one of the pre-defined uninstall reasons. Even so, check out what two simple questions were able to extract from all that raw data.
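The comment percentage is the kind of calculation that's easy to sketch by hand. The version below treats a comment as present only when it contains something other than whitespace. The field names and sample rows are assumptions for illustration; "other" and "temporary-deactivation" are reason codes mentioned in this article, while the rest are made up.

```python
def comment_percentage(rows):
    """Percentage of rows whose comment contains non-whitespace text."""
    if not rows:
        return 0.0
    with_comments = sum(
        1 for row in rows if (row.get("comment") or "").strip()
    )
    return 100.0 * with_comments / len(rows)


# Hypothetical rows; "reason" and "comment" are assumed field names.
rows = [
    {"reason": "other", "comment": "Needed it only for one event"},
    {"reason": "no-longer-needed", "comment": ""},
    {"reason": "temporary-deactivation", "comment": None},
    {"reason": "other", "comment": "   "},
]

pct = comment_percentage(rows)
print(f"{pct:.1f}% of records contain comments")
```

Whether blank or whitespace-only comments should count is a judgment call. It's worth asking ChatGPT how it made that call when it reports a percentage.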
Examine all relevant comments and conduct a thematic analysis to identify common trends and patterns
For each product, describe the prevalent functionality issues described in the comments.
Based on what I know of my users, that analysis is pretty much spot-on. But more to the point, wow! I mean, this thing chugged through 22,797 records and presented overall issues. And it did it in less than a minute. Do you have any idea how long that would have taken to tabulate by hand or to code? Days.
To be fair, ChatGPT didn't just generate the most helpful answer right away. I had to negotiate with it, trying a bunch of different prompts until I found the ones that worked. But even so, that process took less than an hour vs. days.
Want some pie?
Next, I decided to see if I could get some charts. The uninstall reasons come in a set of pre-defined categories, so I wanted to see how they compared. I also wanted to see if the uninstall reasons changed over the years. I fed the AI this prompt:
For each product and then for each year, draw a pie chart of uninstall reason codes. Do not include other, nan, and temporary-deactivation. At the end, note any trends or insights observed.
I actually got back the eight pie charts I expected, but I'm only showing one here. Of particular note is that my data was recorded in 2020, 2021, 2022, and 2023. So why did ChatGPT talk about 2017 and 2018?
The charts were drawn for the correct years, and the data it showed makes sense. I first started using My Private Site because I wanted to block a test site I created for grad school from everyone but me and my professors. Once I graduated, I no longer needed the plugin for that purpose. A lot of people probably download it and use it on a project basis.
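The grouping behind charts like these can be sketched in a few lines: bucket the records by product and year, drop the excluded reason codes, and convert the remaining counts to percentages. This is a minimal sketch under stated assumptions, not the code ChatGPT produced. The field names, the ISO date format, and the reason codes other than the three excluded ones named in the prompt are all hypothetical.

```python
from collections import Counter, defaultdict

# The three codes the prompt asked to leave out.
EXCLUDED = {"other", "nan", "temporary-deactivation"}


def reason_shares(rows):
    """Map (product, year) -> {reason: percentage of included records}."""
    counts = defaultdict(Counter)
    for row in rows:
        # Treat a missing or blank reason as "nan" so it gets excluded.
        reason = (row.get("reason") or "").strip().lower() or "nan"
        if reason in EXCLUDED:
            continue
        year = row["date"][:4]  # assumes ISO dates such as 2022-07-02
        counts[(row["product"], year)][reason] += 1

    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {r: 100.0 * n / total for r, n in counter.items()}
    return shares


# Hypothetical rows for illustration only.
rows = [
    {"product": "My Private Site", "date": "2022-01-05", "reason": "no-longer-needed"},
    {"product": "My Private Site", "date": "2022-03-09", "reason": "no-longer-needed"},
    {"product": "My Private Site", "date": "2022-06-11", "reason": "found-better-plugin"},
    {"product": "My Private Site", "date": "2022-08-30", "reason": "other"},
    {"product": "Seamless Donations", "date": "2023-02-14", "reason": "temporary-deactivation"},
    {"product": "Seamless Donations", "date": "2023-05-01", "reason": "no-longer-needed"},
]

shares = reason_shares(rows)
for (product, year), by_reason in sorted(shares.items()):
    print(product, year, {r: round(p, 1) for r, p in by_reason.items()})
```

Each inner dictionary holds the slice sizes for one pie, so a charting library could draw directly from it.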
The AI also generated some conclusions derived from the data.
The product-specific patterns it identified were fascinating. This is a large language model that theoretically knows nothing of my software apps. Yet its analysis was absolutely spot on. Those two patterns are directly reflective of what I've seen in managing those products.
They don't hate it. They really don't hate it.
Back in February, I shipped a major change in how Seamless Donations handles payment gateways. That version, 5.2, has worried me ever since. I haven't had a lot of user feedback, so it's been hard to tell if users liked it, or hated it, or if it caused them to abandon the product. Usually, when users dislike an upgrade, they're very vocal. But this was huge, and you could hear crickets.
One of the fields in the uninstall data set is for the version number. So I had ChatGPT do some sentiment analysis to see if users who uninstalled from 5.2 onward were doing so because of something new. Let's look at what the AI was able to tell me.
Comparing all data (including whatever comments are available), do users seem more or less satisfied with Seamless Donations from 5.2 onward? Provide details and insights.
Here's what I got back:
Take a moment to appreciate this. I wrote two sentences and the AI looked through 22,797 records and performed a very detailed analysis, all to conclude that users seemed to have a "slight increase in positive sentiment" in the new release.
If I'd had to write the code to do the amount of work the AI did, to process the amount of data involved, it would have taken forever. The programming effort needed to get this information would have been off the charts. Instead, all I had to do was write two prompts.
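To give a feel for just one small piece of that work, here is a sketch of the first step: splitting records into those from before version 5.2 and those from 5.2 onward. It covers only the split, not the sentiment analysis itself. The `version` field name and the sample version strings are assumptions for illustration.

```python
def parse_version(text):
    """Turn '5.2.1' into (5, 2, 1); non-numeric characters are ignored."""
    parts = []
    for piece in text.strip().split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)


def split_by_version(rows, threshold=(5, 2)):
    """Return (before, onward) lists of rows relative to the threshold."""
    before, onward = [], []
    for row in rows:
        version = parse_version(row.get("version") or "")
        if not version:
            continue  # skip rows with no usable version number
        (onward if version >= threshold else before).append(row)
    return before, onward


# Hypothetical rows; "version" is an assumed field name.
rows = [
    {"version": "5.1.9"},
    {"version": "5.2"},
    {"version": "5.2.1"},
    {"version": "5.10.0"},
    {"version": ""},
]

before, onward = split_by_version(rows)
print(len(before), "records before 5.2;", len(onward), "from 5.2 onward")
```

Comparing versions as tuples of integers rather than as strings matters here: as plain text, "5.10.0" would sort before "5.2" and land in the wrong group.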
Sure, if I were a product manager for IBM, I might have been able to bring Watson into the picture and use data-crunching teams to create a product analysis. But as one guy, writing two sentences, and getting insights as valuable as this -- just wow!
I am blown away.
This is a real tool
There is no doubt room for concern about uploading corporate data to ChatGPT Plus. But for data where such concern doesn't exist (like my data set), this is no longer a novelty. It's not just a fun parlor trick.
This is a real productivity tool. This is something we can use to get real work done, that accomplishes something we might not otherwise be able to do, and it does it well. Sure, there's always the concern that the results are wrong, but that would also be a fair concern if someone had written a custom program to generate this information.
I paid twenty bucks and did all of this analysis in the space of a few hours (I was kicked off after having asked too many questions and had to come back a few hours later). The amount of work it would have taken, and what it would have cost, to get the insights I got from my sessions with ChatGPT are almost incalculable by comparison.
This is real, folks. Add it to your toolbox alongside your other powerful productivity tools. And try not to think about purple elephants.
Do you have data you feel safe sharing with ChatGPT? Do you have data where you really want it to provide you with some answers? Have you used ChatGPT in this way before? Discuss with us in the comments below.