Neil Chudleigh, Creator and Founder of SuperWhisper | AIMinds #014

The Journey of Super Whisper: A Revolutionary Voice-to-Text App

Voice-to-text has been around for years, but many people give up on it because it gets words wrong and forces constant corrections. Neil Chudleigh, the creator of Super Whisper, a voice-to-text app for Mac and iOS, set out to fix that. He originally built the app as a side project for his own use. After hundreds of early users told him they had switched to dictating daily, he turned it into a commercial product.

For those who may not be familiar with Super Whisper, it's an app that turns speech into text. You open it with a keyboard shortcut, say what you want to write, and the transcribed text is pasted into whatever app you're using. Optionally, an AI model can reshape that text into an email, notes, or another format you've configured. Whether you're a student trying to organize your study materials, a lawyer drafting documents, or a therapist documenting patient sessions, Super Whisper can serve as a tool for capturing and managing information.

One of the most striking aspects of Super Whisper is its versatility. It can be used by anyone, regardless of their profession or expertise. Neil singles out medical professionals in particular: outpatient clinics take a large volume of notes every day and must rigorously document procedure codes for insurance filing and audits, which makes fast, accurate dictation especially valuable to them. Psychologists and therapists who document sessions are another group he mentions.

The impact of Super Whisper extends beyond professional use cases, however. It has also become a valuable resource for people with disabilities and injuries. Neil reports that users with dyslexia say it helps them organize their thoughts when writing feels overwhelming, and that people with repetitive strain injuries use it as an alternative to typing.

The app's creator, Neil, has been heartened by the diverse range of users who have adopted Super Whisper. From content creators and writers to software engineers and productivity enthusiasts, everyone can benefit from this innovative tool. Even those who may not initially see the value in voice-to-text technology will find that it enhances their daily lives.

One of the most motivating aspects of Super Whisper, in Neil's words, is how it helps people with disabilities, both permanent and temporary. Some users have installed it on the laptop of a parent with dementia. Others have relied on it during a temporary limitation, such as one user who was confined to bed for three weeks after back surgery and a friend of Neil's who broke his hand. For those who struggle with typing due to physical limitations or cognitive impairments, Super Whisper offers a way to keep writing by speaking.

As Neil has shared his story of creating Super Whisper, it's clear that this journey is far from over. The app started on the Mac and is now also available on iOS, where Neil says many people take it on walks to record thoughts as they come up. He continues to expand its features, including speaker separation for meeting transcripts, which he was about to launch at the time of the conversation.

The potential applications of Super Whisper are broad. In the conversation, Neil describes users in hospitals and outpatient clinics, therapy practices, law offices, classrooms, and home offices, as well as content creators and software engineers. What they share is a need to get thoughts, notes, and meeting records into text quickly and in their own style.

In conclusion, Super Whisper represents a significant milestone in the evolution of technology, particularly in the realm of accessibility and user experience. As Neil continues to refine and expand this groundbreaking app, we can expect even more exciting developments on the horizon. By embracing innovation and pushing boundaries, creators like Neil are redefining what's possible and empowering individuals to achieve their full potential.

For those interested in learning more about Super Whisper, the app's website is superwhisper.com, and Neil recommends the demo videos on the Super Whisper YouTube channel as the easiest way to understand what the app can do. Super Whisper is aimed at anyone seeking to streamline their workflow, capture memories, or simply express themselves more easily.

Whether you're a student, lawyer, therapist, content creator, or software engineer, Super Whisper has something to offer. Its revolutionary voice-to-text technology has the potential to transform the way we interact with information and each other, making it an exciting development in our rapidly evolving digital landscape. As Neil continues to shape this innovative platform, one thing is clear: the future of communication and productivity has never looked brighter.

Demetrios: Welcome back to the AI Minds podcast, a podcast where we explore the companies of tomorrow, built AI-first. I'm your host, Demetrios, and this episode is brought to you by Deepgram, the number one text-to-speech and speech-to-text API on the internet, trusted by the world's top conversational AI leaders, startups, and enterprises like Spotify, where you probably listen to some songs, Twilio, NASA, the one that sends rockets up into space, and Citibank. Today we're talking with Neil, the creator of Super Whisper. What's going on, dude?

Neil: Hey, how's it going?

Demetrios: I'm so excited, because your story is one that, when we chatted a week ago, I thought, we have to get you on the podcast. Just for a little background for people: you came into the startup program, you're building Super Whisper, and we're going to get into what exactly it is. I would love to start with some of your learnings over the years, because you have such a rich history. So tell us about what you've been up to for the past couple of years.

Neil: Yeah, I started out actually in the affiliate space, building software, and recently built out a side project called Super Whisper to do really, really high-quality voice-to-text on Mac and iOS. I really just built it for myself. I wanted a high-quality experience. I'm a huge user of voice-to-text on my phone, and I thought the Siri implementation, sorry Apple, is very, very bad. It gets most things wrong, and it doesn't work in a lot of apps. Seeing all the technology that was available, I just decided to throw something together, and it turns out it was really great. It worked amazingly well. I could get it to do things that Apple's dictation just wouldn't even come close to.

Demetrios: So crazy.

Neil: Yeah, it just went from there. The feedback has been incredible. I put it out and just had hundreds of people
gushing, saying how much they were loving it and how much they were converting over to using dictation daily, and I decided to take what was really just a side project, a build-it-for-yourself sort of thing, and turn it into a real commercial product.

Demetrios: So yeah, you scratched your own itch.

Neil: Yeah, it's been a huge learning experience too. I have a lot of software development experience, but this is my first time building something that requires so much compute on the user's computer, my first time doing a Mac application that uses this level of the native APIs, and my first time doing a really heavy audio management application. So it's been really fun. I'm just in a state of learning. It's really great.

Demetrios: Yeah, and I want to get into the fact that you're now the solo founder, the solopreneur, and how that learning, I'm sure, has compounded too. For those that are listening, this is not your first venture. You kind of glossed over it, but maybe we could talk about what you mean by saying you were in the affiliate space, especially because we at Deepgram just set up an affiliate program, and we are using a tool called PartnerStack, which I think you know quite well.

Neil: Yeah, so I founded PartnerStack along with three co-founders in 2014. We went through YC, and I worked on it for nearly 10 years. We basically help software companies find people who love their product, have a significant marketing channel, and want to co-market with the products that they love. So somebody who writes about AI might be an affiliate of Deepgram and write about what they're able to do, or what products are doing with Deepgram. Or they might have an advertising engine or something along those lines, maybe a software review
site, maybe they make YouTube videos. The possibilities are kind of endless. We basically help track what sales they're driving and then pay them out their commission. It's a very different business than what I'm doing now, but it is a very exciting one. The level of partnerships and the depth they go to is quite amazing, and the whole landscape of it is quite fascinating. It really peels back the curtains on a lot of the stuff on the internet.

Demetrios: So it abstracts away lots of those headaches of tracking the affiliates and empowering your super users to be able to talk about you and get a little something for talking about you. If anyone out there is interested in talking about Deepgram and wants to become one of our partners in the ecosystem, feel free to just go to deepgram.com/affiliates and you can find all the information that you need there. But now, Neil, getting back into the inspiration for Super Whisper and how you've learned over these last x amount of months building: there's something that you said that I want to dig into, which is that you're pushing a lot of the compute out to the user, and presumably that is because most of the use cases are happening on the phone. So let's start with, what is Super Whisper? What are some of the use cases that you're seeing people use Super Whisper for? And then maybe you can go into some of those complex details of pushing out the compute to the edge devices.

Neil: Yeah, for sure. So alongside the cloud models that I have available for use in Super Whisper, one of the few use cases I've seen of local AI models is voice-to-text. I think there are advantages and disadvantages to each. If you're on a lower-power device, an older device, cloud models such as Deepgram are
probably going to be your best bet in terms of getting a fast response. But if privacy is really important to you or your connection's not good, local models are quite a promising solution. You can run quite large models these days on consumer hardware, and they'll give you quite good results. It's completely private, and it'll run totally offline. Of course, it does take up some significant system resources, so there's management along with that, making sure that it doesn't overload your Mac. That's actually where Super Whisper started out: completely offline models, offline first, and I have been integrating more and more cloud models as time has gone on. It's quite interesting, especially when you look at cases of users who are handling really sensitive information: health-related, legal, government, just really private information. Obviously, if you type that into your keyboard, it's offline. If you speak it into your microphone, why not have it offline as well? It's been challenging to manage that, and I think one of the most difficult things has been building software around the experience of installing and using offline models. There's not really a playbook for that. It's a fairly new phenomenon that these things exist, that you could run them yourself, and that something like that would go into a piece of consumer software. So on top of the technical challenge of actually building software for it, the user experience and the UI surrounding those models, and explaining it to the user, has been an interesting design challenge and something I'm still battling with.
Demetrios: Yeah, presumably you're trying not to brick the consumer's computer or phone, and that can be a fun challenge because, like you said, you're forging your own path and you may run into questions that don't really have an answer, or it's not clearly laid out. You probably have to go and troubleshoot and go deep into different communities or Reddit forums or whatever it may be to figure out whether anyone else has encountered this problem, and you're probably finding a lot of obscure GitHub repos that are like, oh, this is to fix that one little problem so that you don't brick the computers. So let's get the breakdown: what does Super Whisper do?

Neil: So Super Whisper takes your voice, your recording, and it takes the form of a spotlight bar, kind of like Raycast, or the Spotlight bar that's built into macOS, or Alfred, if you're familiar with any of those. You open it up with a keyboard shortcut, say what you're trying to say, and it'll take that and translate it into text perfectly. If you want to take that a step further and transform that text into an email or notes, or even code, which I've used it for, it can do that using AI models. People have used it for all sorts of things. You can run your voice-to-text results through an AI model immediately afterwards, and if you configure it to do that automatically, it'll automatically paste the result into whatever app you're using. So you can imagine you have Gmail open, you're reading someone's email, and you just respond to it aloud and get a perfectly formatted email, with, say, an opening paragraph, bullet points, a couple of highlighted questions, a summary paragraph, and then a sign-off. And all of that is done off of a quarter of the amount of words of what is actually presented in the email. You can configure this to whatever style you like. You
know, I have modes where it's extremely informal, kind of how I would write instant messages to friends or family, and then stuff that's much more formal, such as writing documents, prose, notes, that kind of thing, and then business email, personal email, that kind of thing. So it's quite flexible. You can do a lot with it. One of the other big use cases is recording meetings. The actual meeting recording has been in the product for a while, so it'll give you a live transcript of the meeting. Whether you're on Google Meet or Zoom or a Teams call or Slack, you can record a meeting and have all those transcripts end up in the same place. And a feature I'll be launching today, actually, is speaker separation, so it'll be able to identify speaker one, speaker two, speaker three, and then you'll be able to label those chunks as a certain person. A lot of meeting recorders have that utility built in already, but the difference with Super Whisper is you'll be able to have that transcript in the same format regardless of which meeting software you're using. On top of that, you can have that transcript pushed over to the language model and have it summarized, with key takeaways and notes, again in a consistent format regardless of how you're having that conversation software-wise. So it's a big tool. There's a lot you can do with it, and I think it's really exciting finding the ways that people are using it. My philosophy in building it is to give people control over the tool. I'm not building a very specific meeting mode; I show you how to configure it for meetings. So give people the power to configure the underlying tools and provide guides on how to do that. And I've been surprised with all the
different ways that people have found of using the tool to be productive in their daily lives.

Demetrios: So it's almost like a design choice: flexibility over that opinionated type of view.

Neil: Exactly. I think there are always ways to bring that flexibility, and the complexity that comes with it, down for people as they're onboarding. But one of the core things I believe with the app is that the way you write across all of the software you use in your daily life is more similar than it is different, and you kind of want to take that with you. And the way that individuals write is so different as well. The way that you write your messages is almost like a fingerprint. I don't even need to see who's writing it; I'm like, oh, that was Demetrios. You have a style, and I think not giving people that control is ultimately how you build something that they're not going to stick with for a long time.

Demetrios: Yeah. And there is one thing that I want to call out, which I loved from some of the videos that I've seen. You respond to an email, and it is in your voice, not just because you literally are speaking it, but because when the email gets written, it's in your voice. And then you did something pretty sneaky that I thought was like, wait, what was that black magic that I just saw? It was saying, "If a time works, book something on my calendar," and then you said some special words, like "insert calendar link" or "Calendly link," something like that, I can't remember exactly what it was, and it hyperlinked your Calendly link inside of Gmail and everything. So I thought that was pretty fancy.

Neil: Yeah, so there are lots of features like that in Super Whisper. That one in particular, if you're familiar with a tool like Text
Wrangler, it's basically just looking for keywords in your transcript and then replacing those with something that you've set up. I've set them up for all my social media account links, my calendar link, my email address, stuff like that, so you can very quickly make sure that it goes in perfectly. Because sometimes voice-to-text is not going to get a URL exactly right if you dictate it, and dictating it is kind of cumbersome.

Demetrios: So painful. Yeah, nobody's going to be sitting there saying, like, "OK, linkedin.com 8569."

Neil: Yeah, I mean, you're not going to remember it. There are all those things. It might get a few characters wrong. One of the other things worth pointing out, which is different from the dictation tools people might be used to on Android or iOS, is that you don't have to dictate punctuation. Super Whisper is going to pick it up off of the pauses and intonation in your voice. Even if you take a long pause in the middle of a sentence, it's going to take the context of that sentence and the words that it's found and decide: OK, is that two sentences or is that one? And it's going to join them or separate them accordingly. It's trained off of video and subtitles as well as recordings of people dictating, so the models really do have an understanding of your intent as a speaker. I know a lot of people who are like, "Oh, I've tried dictation in the past and I just find it too nitpicky, too fiddly. I can't get out what I'm wanting to say faster, and I'm having to go back and edit." And I find if I can convince those people to try it, they often come away very excited about the improvement over what they're used to.

Demetrios: Well, because what happens to me a lot is that I'll be speaking and dictating and then reading what is
coming out, and because what comes out is incorrect, it throws off my dictation, and then I have to stop dictating and go and manually type in and correct something, or whatever it may be. It totally is that frustrating moment of, "Oh, where was I?" and then I start the dictation back up and go for it. So you're trying to avoid that, I'm guessing.

Neil: Yeah, 100%. Like I said, there is a real-time mode that gives you a sort of transcript of what's being said. Typically you're using that if you're taking notes on a video or you're in a meeting. But a lot of people, once they get into the flow of using it and trust the tool, prefer to have that off when they're just writing, because not seeing what you're saying, or how the computer's interpreting it, until you're done frees up your brain to continue thinking about whatever you're talking about. So you focus 100% on your message and what you're trying to write. I think that's important, especially when you have a language model cleaning up the punctuation, grammar, and sentence structure, reorganizing the ideas, and making sure that you're not so constrained to make each word placement perfect, as long as the general idea gets across. You can even do mid-sentence corrections. If you say something that's not quite right and then say, "Oh wait, sorry, no, I meant 3 p.m., not 4," you can set up the language model to go back, take that 4 p.m. piece of information, and replace it with the three, or it'll know which is the correct piece of information. So it really does release you from the nitpicky nature of traditional voice-to-text.

Demetrios: And how funny is that, that the user experience is actually better not seeing what you're saying? It makes complete sense to me. When we're talking right
now, there is no text coming up showing us what we are saying in this conversation. I think it would be way more distracting if I had subtitles happening in real time as we're talking, and I wouldn't be able to follow the conversation or give you my full attention. So it makes a whole lot of sense that you wouldn't necessarily want that, especially not 100% of the time.

Neil: Yeah, it's going to be up to user preference, and I think a lot of people initially are maybe uncomfortable with it, so the option is there. I think there is some utility in it as well. Say it's a many-person meeting and you get pulled over to something else; being able to scroll back up and read through the transcript to catch back up, that's utility, and it's in the tool. So you can choose. But for me personally, if I'm writing something, I don't want to see the immediate feedback. I find it's too much.

Demetrios: Well, let's talk about the writing, because I think that is a huge use case, and I'm seeing more and more people talk about how they are writing their blog posts by dictating. I personally have the hardest time continuing my train of thought when I dictate. I feel like it's great to get almost a first pass and try to get everything out there really quickly, but at the same time I get to a certain place and then I forget where I wanted to go, or my words get ahead of me. So maybe it's more of an art form, and if you practice it you get better at it. But I would love to hear how you've been using it for writing.

Neil: I have the app set up in a way where you have access to all the features, even every single paid feature, for 15 minutes of recordings, and then it bumps you back to the freemium, or the free tier, which is still a pretty good tool. There are tons of people who just use the free tier option and continue on with it, and it's
great. I think it misses a lot of the huge power advantages of the tool, but I think if you sit down with it for 15 minutes and break through maybe your past experiences with dictation and the first passes, and build that trust that it's going to capture everything, I think that's great. A lot of people will convert at that point. Typing will start to feel slow, and they'll want to be dictating everything. As for the process of writing with it, the tool is not necessarily meant to be used as one big recording that then gives you a whole document. It can do that, but I think it's best used in conjunction with Word or Notes or whatever you're writing with, especially if you have an idea for something. Just get it out there on the page and go paragraph by paragraph, or two paragraphs at a time. With the language model, you can get it to restructure your sentences if you want to do things like "simplify my language," or alternate the vocabulary, or alternate hard-hitting sentences with softer, more comfortable sentences to create interesting writing. Those sorts of transformations on what you've said are possible. So I think what it can do for a writer is quite powerful, and something that should free them up to think not so much about the mechanics of what they're writing and more about the idea they're trying to get across. I would just say it does take a bit of effort. The first time you sat down at a keyboard, were you instantly comfortable typing? My guess is no. So it is learning to use a
new tool.

Demetrios: Actually, now that you say that, it's funny, because I do see it as something where maybe the way that I was trying to do it in the past wasn't necessarily the optimal way. I like this idea of, hey, maybe you're going through a document. Like, I'm reading a paper about AI, for example, and I want to give my thoughts on that paper. I want to highlight some of the key sentences or key points in my own words, as opposed to just highlighting it. So I have Super Whisper with me, and it is on, listening, while I am scanning this document, and it is taking notes as I'm speaking them: "Oh wow, isn't this interesting, they're doing this, or they're trying that. OK, they're using these words, or they're using this formula." I'm definitely not going to try to speak a formula into transcription yet. I don't think that's going to be there at this point in time. I don't even know a lot of the different letters that they use. I could get very lost. But having it there, again, it's hanging out, listening to me and taking down everything that I have, as opposed to what I was doing in the past, where it was like, OK, now it's time to take all of the ideas that I have in my head and get them onto a piece of paper, so I have to have them formulated and I have to know exactly how I want them to come out, the structure and the organization. It's a different way of using it and a different way of thinking about how it can be used.

Neil: Yeah, 100%. Again, that flexibility exists to fill the gap. I want people to understand what the utilities are under the hood and to apply them to their daily lives in ways that they find useful. I want to show some examples but not be prescriptive, because the workflow for a lawyer and a student taking notes are very
different. And even between two lawyers, the way that they want to interact with voice is probably pretty different. They're different people, with different experiences, maybe different skill sets. Maybe one of them is a lifetime dictator and has a workflow that they like, and the other one has never used dictation before. So there's got to be that flexibility to adapt to the workflow that they're used to, but hopefully elevate it, and then something different for someone who's just starting to build it into their daily workflow.

Demetrios: It's so funny you say a lawyer too, because my dad is a lawyer, and I remember when I was growing up going into his office, and he had one of those recorders, and he would sit there and talk into it. It would be very short snippets, and he had his style and his workflow, and then later he would get an assistant to type it up. Now all of that does not need to happen. He doesn't have to have somebody type it up. It can be in real time, he can see it as he's doing it, or right after he's done he can get a summary of all of it, or maybe suggestions on where to make it better. I think that is super cool. I imagine you've seen a few other use cases. It sounds like for a lawyer that's a no-brainer, and a student is another no-brainer. What are some of the other ones that people have come to you with?

Neil: Yeah, basically everybody in the medical fields. I think a lot of people think of traditional medicine, in a hospital, but even more heavily outside of that, outpatient clinics take an absolute mountain of notes every day, and their requirements to take notes are actually quite high because of the filing process they have afterwards to interact with insurance providers. So they're having
to document codes for different procedures, and they have to take very rigorous notes of every single session, because if they don't and they get audited, they're in trouble. So there are a ton of different applications there, and not even necessarily physical health, but psychology, therapists, that sort of thing. Tons of productivity enthusiasts too: content creators, writers, a lot of people who are really into personal knowledge management, a lot of software engineers.

Demetrios: Yeah, I could see it being fascinating for someone who just picks up their phone in the morning and speaks into it as a diary, in a way, and then you have it all there, something that's able to capture your thoughts. I wish I could do it every morning when I wake up, and just capture how I'm feeling, what I'm getting ready for in the day, and almost document my life in that regard.

Neil: Yeah, more so than waking up in the morning, with the iOS application I'm getting the feedback that a lot of people are taking it on a walk. They start a recording and just record thoughts as they come up on their walk, which is quite nice. I really like the idea of taking that time to have nothing in front of you and just be talking. One of the other most motivating groups, actually, now that I'm thinking about it, is those with disabilities, both permanent and temporary. I've had people who have installed it on their parent's laptop, if their parent has dementia. I have people with dyslexia telling me that it's helping them with organizing their thoughts, since sometimes writing can be overwhelming. There are people with repetitive strain injuries. I had one guy
who was getting back surgery, so he was in bed for three weeks. He had his laptop propped up, and he had to be in this one position, couldn't move his back, basically, and he was using dictation for that whole time. One of my close friends actually broke his hand opening a jar of pickles, of all things. Snapped one of the bones of his hand. I know.

Demetrios: Oh my God, crazy.

Neil: And he became a user of Super Whisper immediately after that. So it's been awesome hearing about how it's helping people, and that continues to expand as the product is coming to the phone and people are getting their hands on it there.

Demetrios: Excellent. Well, Neil, this has been fascinating, man. I really appreciate you coming on here and talking about your journey and about Super Whisper. Anything else you want to mention before we jump? I know there's all kinds of cool stuff that you're doing. People can get involved by Googling Super Whisper, but you've also got really cool use cases and demos on your YouTube channel, so I encourage anyone who is at all curious to go check out Super Whisper on YouTube, and you'll see all the different ways that Neil is talking about right here.

Neil: Yeah, it's superwhisper.com, and superwhisperapp on Twitter. That's pretty much it. I would really recommend checking out the videos. They're the easiest way to really understand what you can do with it.

Demetrios: Yep, as it was for me. That was the mind-blowing piece. When I saw you put in that Calendly link, I was just like, whoa, play that back, did I see that right? So I love it, man. Well, this has been great, and I super appreciate you being part of the Deepgram startup ecosystem. It's really cool to see, and I wish you continued success with everything you're doing with Super Whisper.

Neil: All right, yeah, thanks, man. Take care.
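
A note on the keyword replacement Neil describes in the conversation (looking for keywords in the transcript and swapping in text the user has set up, such as a calendar link): the idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not how Super Whisper is actually implemented. The function name `build_replacer`, the trigger phrases, and the example.com addresses are all hypothetical.

```python
import re
from typing import Callable, Dict


def build_replacer(replacements: Dict[str, str]) -> Callable[[str], str]:
    """Return a function that swaps spoken trigger phrases for configured text.

    Hypothetical sketch: matching is case-insensitive and on whole words, and
    trigger phrases are assumed to start and end with a letter or digit.
    """
    lookup = {phrase.lower(): text for phrase, text in replacements.items()}
    if not lookup:
        # Nothing configured: leave the transcript untouched.
        return lambda transcript: transcript

    # Longest phrases first, so a longer trigger wins over a shorter one it contains.
    phrases = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )

    def replace(transcript: str) -> str:
        # A single pass over the text, so inserted text is never re-scanned.
        return pattern.sub(lambda m: lookup[m.group(0).lower()], transcript)

    return replace


if __name__ == "__main__":
    # Made-up triggers and targets, for illustration only.
    replace = build_replacer({
        "insert calendar link": "https://example.com/book-a-time",
        "my email address": "neil@example.com",
    })
    print(replace("If a time works, book something here: Insert calendar link."))
```

Doing the substitution after transcription, rather than asking the speech model to spell out a URL, matches the motivation given in the conversation: dictated URLs tend to come out with a few wrong characters, while a configured replacement is inserted exactly as typed.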