ChatGPT can transcribe audio, but the answer depends on what you mean by "ChatGPT." ChatGPT Record Mode can record, transcribe, and summarize meetings or voice notes, but it is currently a macOS desktop app feature for Plus, Pro, Business, Enterprise, and Edu workspaces, with a 4-hour / 240-minute session limit. The OpenAI Speech-to-Text API can transcribe uploaded audio files using models such as gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, and whisper-1, but API audio uploads are currently limited to 25 MB and require cloud processing.
The important catch: ChatGPT transcription is not offline. Record Mode, Voice Mode, and the OpenAI API all require audio or transcript data to be processed through OpenAI's cloud. If your search intent is really "transcribe audio on my iPhone without uploading it", VoiceScriber is the better fit: an offline iOS transcription app built for private voice notes, interviews, meetings, and sensitive recordings that should stay on-device.
TL;DR
Can ChatGPT transcribe audio? Yes — but not offline. In 2026, the main OpenAI options are ChatGPT Record Mode for live recording on the macOS desktop app and the OpenAI Speech-to-Text API for audio files. Record Mode is capped at 4 hours / 240 minutes per session. The Audio API supports models including gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, and whisper-1, with a current 25 MB audio file upload limit.
For privacy, the details matter. OpenAI says Record Mode audio is deleted after transcription, but the generated transcript/canvas follows workspace retention settings. OpenAI's API data-control table currently lists /v1/audio/transcriptions and /v1/audio/translations as not used for training, with no abuse-monitoring retention and no application-state retention. Still, both options are cloud-based. If your requirement is "no recording leaves my iPhone," use VoiceScriber, an offline iPhone transcription app for private, on-device voice-to-text.
ChatGPT vs VoiceScriber vs OpenAI API: quick comparison
| Feature | ChatGPT Record Mode | VoiceScriber iOS app | OpenAI Speech-to-Text API |
|---|---|---|---|
| Best for | Meetings, brainstorms, voice notes, instant summaries | Private iPhone transcription, offline notes, interviews, lectures | Developer workflows, audio-file transcription, integrations |
| Works offline? | No | Yes | No |
| Platform | ChatGPT macOS desktop app | iPhone / iOS | Any app or backend using the API |
| Current main limit | 4 hours / 240 minutes per session | Device/app-bound, no cloud upload cap | 25 MB audio upload limit |
| Audio files | Record live audio in the app | Record/transcribe on-device | Upload supported audio formats such as MP3, M4A, WAV, WEBM |
| Speaker labels | Multiple-speaker support | Best for private notes/interviews | gpt-4o-transcribe-diarize supports speaker-aware diarized_json |
| Where processing happens | OpenAI cloud | On your iPhone | OpenAI cloud |
| Privacy fit | Cloud-friendly users who want AI summaries | Privacy-first users who want no upload | Developers needing endpoint-specific cloud controls |
| Best high-intent query | "ChatGPT meeting transcription" | "offline transcription app iPhone" | "OpenAI API transcribe audio file" |
Table of contents
- ChatGPT vs VoiceScriber vs OpenAI API: quick comparison
- Can ChatGPT transcribe audio?
- Record Mode vs API vs Realtime: what's the difference?
- What data does OpenAI delete, retain, or use for training?
- What changed in 2026?
- ChatGPT transcription limits in 2026
- ChatGPT transcription: key pros
- Where ChatGPT transcription breaks down
- Who should choose an offline iPhone transcription app instead?
- The private offline alternative: VoiceScriber
- At-a-glance comparison table
- If you must use ChatGPT, harden your setup
- How we tested
- Related articles
- FAQs
1. Can ChatGPT transcribe audio?
Yes. In 2026, ChatGPT/OpenAI transcription falls into three different buckets:
- ChatGPT Record Mode: records, transcribes, and summarizes live audio such as meetings, brainstorms, interviews, and voice notes. It is currently available only in the ChatGPT macOS desktop app for Plus, Pro, Business, Enterprise, and Edu workspaces.
- OpenAI Speech-to-Text API: a developer API for transcribing uploaded audio files or translating audio into English. It supports gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, and whisper-1.
- Realtime transcription: an API path for live audio streams from a microphone, call, or media stream. OpenAI recommends the Realtime transcription guide for ongoing audio rather than the file-oriented transcription path.
But none of these are offline iPhone transcription. If you need to transcribe a private recording on your iPhone without uploading it, ChatGPT is the wrong tool. Use an offline iOS app like VoiceScriber instead.
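For the developer route described above, here is a minimal sketch of uploading one audio file to the Speech-to-Text API with the official `openai` Python package. The model names, the 25 MB cap, and the format list come from this article. The helper names `check_upload` and `transcribe_file`, and the reading of "25 MB" as 25 × 1024 × 1024 bytes, are assumptions made for illustration. The call sends audio to OpenAI's cloud, so it is not an offline path.

```python
from pathlib import Path

# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes; confirm against OpenAI's docs.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024
# Formats named in this article; OpenAI's docs hold the authoritative list.
SUPPORTED_SUFFIXES = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_upload(path, limit_bytes=MAX_UPLOAD_BYTES):
    """Raise ValueError if the file looks unsuitable for a single API upload."""
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported audio format: {p.suffix or '(none)'}")
    size = p.stat().st_size
    if size > limit_bytes:
        raise ValueError(f"{p.name} is {size} bytes; limit is {limit_bytes}")


def transcribe_file(path, model="gpt-4o-mini-transcribe", limit_bytes=MAX_UPLOAD_BYTES):
    """Upload one audio file and return the transcript text.

    Needs the `openai` package and an OPENAI_API_KEY environment variable.
    """
    check_upload(path, limit_bytes)

    # Imported lazily so the local checks above work without the SDK installed.
    from openai import OpenAI

    client = OpenAI()
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text
```

A call such as `transcribe_file("interview.m4a")` returns the transcript as a string. Running the local checks first avoids a wasted upload when the file is too large or in an unexpected format.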
2. Record Mode vs OpenAI API vs Realtime: what's the difference?
| Feature | Record Mode | Speech-to-Text API | Realtime API |
|---|---|---|---|
| Best for | Meetings, voice notes, brainstorms, summaries | Uploading audio files for transcription | Streaming voice apps and low-latency audio |
| Platform | macOS desktop app only | Any platform via API | Any platform via WebRTC/WebSocket |
| Access | Plus, Pro, Business, Enterprise, Edu | API key with billing | API key with billing |
| Current models | Internal / not user-selectable | gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, whisper-1 | Realtime-specific models and transcription settings |
| Speaker diarization | Supports multiple speakers | Use gpt-4o-transcribe-diarize with diarized_json | Depends on implementation |
| Main limit | 4 hours / 240 minutes per session | 25 MB per file upload | Depends on realtime session, model, and usage limits |
| Privacy posture | Audio deleted after transcription; transcript/canvas follows workspace retention and training settings | Audio transcription/translation endpoints currently show no training, no abuse-monitoring retention, and no application-state retention | Cloud-based; check API data controls and retention settings |
| Offline? | No | No | No |
Bottom line: ChatGPT is strong for cloud AI workflows. VoiceScriber is the better fit when you need an offline transcription app for iPhone, private voice-to-text, no cloud transcription, or airplane mode transcription.
3. What data does OpenAI delete, retain, or use for training?
This is the most important privacy question — and the answer depends on which OpenAI product you use.
Record Mode (ChatGPT macOS app)
- Audio recordings are used only for transcription and deleted afterward.
- Transcripts and canvases follow your workspace retention settings. If you delete the conversation, the associated transcript/canvas is removed from OpenAI systems within 30 days, unless OpenAI is legally required to retain it.
- Reference record history: When enabled, ChatGPT can reference past recording canvases and transcripts in future conversations. This can be useful for meeting memory, but it is a privacy consideration for sensitive topics.
- Consumer ChatGPT: If model improvement is enabled, transcripts and canvases may be used to improve OpenAI models unless you opt out.
- Business, Enterprise, and Edu: Record Mode transcripts and canvases are excluded from model training by default.
Speech-to-Text API
- The /v1/audio/transcriptions and /v1/audio/translations endpoints currently show no training, no abuse-monitoring retention, and no application-state retention.
- OpenAI says API inputs and outputs are not used for training by default unless an organization explicitly opts in.
- Other API endpoints can have different retention behavior, so do not assume every OpenAI endpoint has the same privacy profile as the audio transcription endpoints.
Legal holds and litigation
Normal deletion timelines can be overridden by legal obligations. OpenAI previously said a 2025 legal order requiring indefinite retention of certain consumer ChatGPT and API content ended on September 26, 2025, but it also describes ongoing NYT litigation demands involving a de-identified sample of consumer ChatGPT conversations from December 2022 to November 2024. The practical takeaway is simple: cloud retention rules can change when legal process is involved. For highly sensitive recordings, the lower-risk path is to avoid uploading the audio in the first place.
Key takeaway for privacy-first users: OpenAI's API audio endpoints currently have the cleanest cloud-retention story, but they are still cloud endpoints. If your requirement is "no upload, no server, no third-party retention risk", use an on-device iPhone app like VoiceScriber instead.
4. What changed in 2026?
The main 2026 update is clarity: ChatGPT can transcribe audio, but each route has a different limit and privacy profile.
- Record Mode is now a 4-hour / 240-minute feature. Older 120-minute references are outdated.
- Record Mode is still macOS-only. It is not an offline iPhone transcription app.
- Audio files are best handled through the API, not normal ChatGPT document uploads. OpenAI's ChatGPT file-upload docs focus on documents, spreadsheets, presentations, images, PDFs, and text files, while OpenAI's audio-file transcription docs point to the Speech-to-Text API with a 25 MB audio upload limit.
- Speaker-aware API transcription is stronger. gpt-4o-transcribe-diarize supports speaker-labeled diarized_json, useful for meetings, interviews, and multi-speaker recordings.
- Privacy settings are more important. "Reference record history" can let ChatGPT use past recording notes and transcripts in future conversations when enabled.
- On-device transcription keeps improving. Recent edge-ASR research shows CPU-only, faster-than-real-time on-device speech recognition is becoming more practical, and Apple continues to push privacy-oriented on-device AI that can work offline.
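To make the speaker-labeled output mentioned above more concrete, here is a small sketch that turns diarized segments into a readable transcript. It assumes each segment is a dictionary with `speaker`, `start`, `end`, and `text` keys. That shape is an assumption for illustration, and the real diarized_json schema should be checked in OpenAI's documentation. The function names and sample data are hypothetical.

```python
def merge_speaker_turns(segments):
    """Merge consecutive segments from the same speaker into single turns.

    Assumed segment shape: {"speaker": str, "start": float, "end": float, "text": str}.
    """
    turns = []
    for seg in segments:
        text = seg["text"].strip()
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + text
            turns[-1]["end"] = seg["end"]
        else:
            turns.append(
                {"speaker": seg["speaker"], "start": seg["start"], "end": seg["end"], "text": text}
            )
    return turns


def format_turns(turns):
    """Render turns as '[start-end] speaker: text' lines."""
    return "\n".join(
        f'[{t["start"]:.1f}s-{t["end"]:.1f}s] {t["speaker"]}: {t["text"]}' for t in turns
    )


# Hypothetical sample data, not real API output.
sample = [
    {"speaker": "A", "start": 0.0, "end": 2.1, "text": "Shall we start?"},
    {"speaker": "A", "start": 2.1, "end": 4.0, "text": "First item is the budget."},
    {"speaker": "B", "start": 4.2, "end": 6.5, "text": "I have the numbers."},
]
print(format_turns(merge_speaker_turns(sample)))
```

Merging adjacent segments by speaker keeps meeting notes compact, since diarized output tends to arrive as many short segments rather than whole turns.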
Bottom line: ChatGPT is improving for cloud transcription and AI summaries. VoiceScriber is still the better answer if you need offline iPhone transcription, private voice-to-text, no cloud transcription, or airplane mode transcription.
5. ChatGPT transcription limits in 2026
| Limit | ChatGPT Record Mode | ChatGPT file uploads | OpenAI Speech-to-Text API | VoiceScriber |
|---|---|---|---|---|
| Best use case | Live meetings, voice notes, summaries | Documents, PDFs, spreadsheets, presentations, images | Audio-file transcription and developer workflows | Offline iPhone transcription |
| Audio support | Live recording in macOS app | Public docs describe document/file workflows, not the main audio transcription route | Audio files such as MP3, M4A, WAV, WEBM | Record and transcribe on-device |
| Main limit | 4 hours / 240 minutes per session | 512 MB per uploaded ChatGPT file; document/text limits also apply | 25 MB audio upload limit | No cloud upload limit |
| Internet required? | Yes | Yes | Yes | No |
| Platform | macOS desktop app | Web, iOS/Android apps, supported plans | API | iPhone / iOS |
| Privacy posture | Audio deleted after transcription; transcript/canvas follows workspace retention | Files are saved to your Library/account unless deleted or excluded by product behavior | Audio transcription/translation endpoints currently show no training, no abuse-monitoring retention, no application-state retention | Audio and transcripts stay on-device unless exported/shared |
This distinction matters because searchers often mix up ChatGPT file upload limits with OpenAI audio transcription upload limits. The ChatGPT document upload limit is not the same thing as the Audio API's 25 MB transcription limit.
For files larger than 25 MB, OpenAI recommends splitting audio into smaller chunks or using compressed formats. If you want a tool that does not depend on API upload limits, Wi‑Fi, or server-side processing, see our airplane-mode test of 7 popular transcription tools.
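The chunking advice above comes down to simple arithmetic. The sketch below estimates how much audio fits under the 25 MB cap at a given bitrate and plans equal-length chunks for a longer recording. It assumes constant-bitrate audio, reads "25 MB" as 25 × 1024 × 1024 bytes, and leaves 5% headroom for container overhead. All three are illustrative assumptions, and the function names are hypothetical.

```python
import math

# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes; confirm against OpenAI's docs.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def max_seconds_per_upload(bitrate_kbps, limit_bytes=MAX_UPLOAD_BYTES, headroom=0.95):
    """Estimate how many seconds of constant-bitrate audio fit in one upload."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return limit_bytes * headroom / bytes_per_second


def plan_chunks(duration_seconds, bitrate_kbps, limit_bytes=MAX_UPLOAD_BYTES):
    """Return (start, end) offsets in seconds for equal-length chunks under the cap."""
    max_len = max_seconds_per_upload(bitrate_kbps, limit_bytes)
    count = max(1, math.ceil(duration_seconds / max_len))
    step = duration_seconds / count
    return [(i * step, min((i + 1) * step, duration_seconds)) for i in range(count)]


# A 4-hour recording at 128 kbps needs 10 chunks of 24 minutes each.
chunks = plan_chunks(4 * 60 * 60, bitrate_kbps=128)
print(len(chunks), chunks[0])
```

At 128 kbps, roughly 26 minutes fit in one upload under these assumptions, so a full 4-hour recording needs about ten chunks. Fixed offsets can cut a sentence in half, so in practice it may be worth nudging each boundary to a nearby pause.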
6. ChatGPT transcription: key pros
- End-to-end convenience: Record, transcribe, and auto-summarize in one place; outputs can be turned into tasks, emails, or plans.
- Speaker diarization: Record Mode supports multiple speakers; the API offers gpt-4o-transcribe-diarize for speaker identification.
- Enterprise-friendly controls: Business/Enterprise/Edu content is excluded from training by default; admins can control or disable Record Mode.
- API audio endpoints list no retention: OpenAI's data-control table currently lists the transcription and translation endpoints as not used for training, with no abuse-monitoring retention and no application-state retention.
- Multilingual: The speech-to-text models support a wide range of languages, though accuracy varies.
7. Where ChatGPT transcription breaks down
- It is not offline: Record Mode and the API require internet. If you need to transcribe on a flight, in a hospital dead zone, in court, or during field interviews without signal, ChatGPT is the wrong fit.
- It is not iPhone-native transcription: Record Mode is macOS-only. iPhone users who want tap-to-record offline transcription need a separate iOS app.
- Record history can create context bleed: When "reference record history" is enabled, past transcripts can influence future responses. That is useful for continuity but risky for sensitive meetings.
- Consumer training settings matter: Consumer ChatGPT content may be used for model improvement unless you opt out. Many casual users never review this setting.
- Legal holds can override deletion promises: Even when a product has normal deletion windows, active litigation or legal obligations can change what must be retained.
- API file limits remain: The API caps file uploads at 25 MB, and long recordings may need chunking.
8. Who should choose an offline iPhone transcription app instead?
If you found this article by searching for "ChatGPT transcription privacy," "offline transcription app iPhone," "AI voice notes no cloud," or "transcribe audio without uploading," you are probably not just looking for transcription — you are looking for control.
- Lawyers and legal teams: Attorney-client conversations should stay under your control. A local-only iPhone transcription app reduces third-party exposure. See our guide on secure offline transcription for lawyers.
- Clinicians and therapists: Patient notes and therapy recordings can include sensitive health information. Keeping audio on-device simplifies the data-flow conversation. Read more about on-device transcription for healthcare and therapy notes.
- Journalists and researchers: Source interviews are often sensitive. Offline transcription avoids creating a cloud copy of raw audio.
- Finance, HR, and compliance teams: Internal investigations, earnings prep, employee notes, and regulated conversations should minimize unnecessary processors.
- Students, creators, and travelers: Lectures, ideas, and interviews should still be transcribable on a train, plane, or in low-signal environments.
The business case for offline transcription is stronger in 2026 because AI governance is now a board-level privacy issue, not just an IT preference. IBM's 2025 Cost of a Data Breach Report puts the global average breach cost at US $4.4 million and says 97% of organizations that reported an AI-related security incident lacked proper AI access controls. For sensitive voice data (legal notes, therapy sessions, patient details, HR conversations, source interviews, financial discussions), fewer cloud hops mean fewer systems, vendors, policies, and legal exceptions to worry about.
Need transcription where the recording never leaves your iPhone?
VoiceScriber works offline, supports 100+ languages, and keeps audio and transcripts local unless you choose to export or share them.
9. The private offline alternative: VoiceScriber for iPhone
If your requirement is "transcribe audio on my iPhone without uploading it", choose a tool built around that requirement from the start.
VoiceScriber is built for high-privacy iOS transcription:
- 100% offline, on-device transcription — works in airplane mode and does not send recordings or transcripts to a server.
- Private iPhone voice notes — record, transcribe, edit, search, and export from your device.
- 100+ languages supported offline.
- No account required — no sign-up barrier, no cloud sync requirement, no server-side transcript storage.
- Built for real-world privacy workflows — lawyers, clinicians, journalists, students, creators, and teams that need reliable transcription even without Wi‑Fi.
VoiceScriber is purpose-built for privacy-critical iPhone workflows. All audio and transcripts remain on your device unless you explicitly export or share them. There is no upload step, no cloud transcription queue, and no third-party transcript retention policy to interpret.
Need private transcription on iPhone — without uploading audio?
VoiceScriber works offline in 100+ languages and keeps recordings and transcripts on your device.
Download VoiceScriber on the App Store
10. At-a-glance: ChatGPT transcription vs. VoiceScriber
| Factor | ChatGPT (Record Mode / API) | VoiceScriber (Offline iPhone Alternative) |
|---|---|---|
| Connectivity | Internet required | No internet needed; airplane mode OK |
| Where processing happens | OpenAI cloud | On your iPhone |
| Training usage | Consumer ChatGPT may train on transcripts/canvases unless you opt out; business tiers and API are excluded by default | No upload; recordings stay local unless you export/share |
| Data retention | API audio endpoints currently show no retention; Record Mode transcripts follow workspace retention and legal requirements | Local only until you choose to export |
| Speaker diarization | Record Mode supports multiple speakers; API offers gpt-4o-transcribe-diarize | Best for private voice notes, interviews, and on-device transcription workflows |
| Languages | Many languages supported; Record Mode works best in English today | 100+ languages offline |
| Limits | Record Mode: 4 hours / 240 minutes; API: 25 MB upload limit | Device-bound and app-bound, with no cloud upload limit |
| Platform | Record Mode: macOS only; API: any platform for developers | iPhone / iOS |
| Best for | Cloud-friendly AI summaries, team workflows, API integrations | Private iPhone transcription, offline voice notes, no-cloud workflows |
11. If you must use ChatGPT for transcription, harden your setup
- Turn off training for consumer ChatGPT: Use OpenAI's privacy controls so future conversations are not used to improve models.
- Prefer Business, Enterprise, Edu, Healthcare, or API for sensitive workflows: These are excluded from model training by default unless your organization opts in.
- Review "Reference record history": Disable it if you do not want past recording transcripts and canvases referenced in future chats.
- Use the audio API for one-off sensitive cloud transcription: The audio transcription/translation endpoints currently have the cleanest OpenAI API retention story.
- Do not mix sensitive recordings with general prompts: Keep privileged, clinical, financial, or HR recordings separate from casual ChatGPT use.
- Choose offline for zero-upload needs: If policy says audio must not leave the device, skip cloud transcription entirely and use VoiceScriber.
12. How we tested
To write this guide, we tested ChatGPT transcription (Record Mode and the speech-to-text API) and VoiceScriber side by side across five real-world scenarios:
- Quiet English memo — a solo voice note recorded in a silent room, approximately 3 minutes.
- Noisy cafe environment — a voice recording captured in a busy coffee shop with background chatter, music, and espresso machine noise.
- Two-speaker meeting — a simulated two-person meeting to test speaker separation and overlapping speech handling.
- Accented English — recordings from speakers with non-native English accents (Turkish, German) to evaluate robustness.
- Non-English audio — clips in Turkish, Spanish, and Japanese to compare multilingual accuracy and offline language coverage.
For each scenario, we compared accuracy, latency, and whether the tool worked without an internet connection. VoiceScriber was tested in airplane mode throughout. ChatGPT required a stable Wi-Fi connection for every test.
This is not a formal benchmark — it is a practical, hands-on evaluation designed to reflect how these tools perform in the situations most readers actually face.
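The guide does not say which accuracy metric was used. One common choice for comparing transcripts is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the tool's output into a reference transcript, divided by the reference length. The sketch below is a minimal implementation, offered as an assumption about how such a comparison could be scored rather than a description of the method used here.

```python
def word_error_rate(reference, hypothesis):
    """Return WER: word-level edit distance divided by reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words gives a WER of 0.25.
print(word_error_rate("the quick brown fox", "the quick fox"))
```

This version only lowercases and splits on whitespace. A fairer comparison would also strip punctuation and normalize numbers before scoring, because tools format those differently.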
FAQs
Can ChatGPT transcribe audio?
Yes. ChatGPT can transcribe audio through ChatGPT Record Mode in the macOS desktop app, and OpenAI can transcribe audio files through the Speech-to-Text API. Record Mode is for live recordings and summaries. The API is for developers uploading supported audio files. Neither option is offline.
Can ChatGPT transcribe audio files?
For audio files, the official OpenAI route is the Speech-to-Text API, not normal ChatGPT document upload. The API supports formats such as MP3, MP4, MPEG, MPGA, M4A, WAV, and WEBM, with a current 25 MB upload limit. ChatGPT's public file-upload docs focus on documents, spreadsheets, presentations, images, PDFs, and text files.
What are the ChatGPT audio upload limits in 2026?
The key limit for OpenAI audio-file transcription is the 25 MB limit on Speech-to-Text API file uploads. ChatGPT's general file-upload limits, such as 512 MB per uploaded ChatGPT file, apply to ChatGPT document/file workflows and should not be confused with the Audio API's transcription upload limit.
Is ChatGPT transcription private?
It depends on which product you use. OpenAI says Record Mode audio is deleted after transcription, but transcripts and canvases follow your workspace retention settings. Consumer ChatGPT transcripts/canvases may be used for model improvement if the relevant setting is enabled, while business offerings and API data are excluded from training by default unless opted in.
Does ChatGPT transcription work offline?
No. ChatGPT Record Mode and the OpenAI speech-to-text API require internet. If you need offline transcription on iPhone, VoiceScriber works on-device and does not upload recordings.
Does OpenAI train on my transcripts?
- Consumer ChatGPT: Content may be used to improve models unless you opt out.
- Business, Enterprise, Edu, Healthcare: Excluded from training by default.
- API audio endpoints: Not used for training, with no abuse-monitoring retention for /v1/audio/transcriptions and /v1/audio/translations.
How long does OpenAI keep my audio data?
It depends on the product. Record Mode deletes audio after transcription; transcripts and canvases follow workspace retention and are removed within 30 days after deletion unless legal obligations require retention. The API audio endpoints (/v1/audio/transcriptions, /v1/audio/translations) currently show no training, no abuse-monitoring retention, and no application-state retention.
What is the ChatGPT Record Mode limit?
OpenAI currently lists Record Mode's recording length cap as 4 hours / 240 minutes per session. API file uploads are limited to 25 MB.
What languages does ChatGPT transcription support?
The speech-to-text API supports a wide range of languages across its models (gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, whisper-1). Record Mode works best in English today. Accuracy varies by language and model.
Can ChatGPT Record Mode reference my past recordings?
Yes, when "reference record history" is enabled. Past recording transcripts and canvases can be referenced in later conversations. You can disable this in your settings.
Is ChatGPT Record Mode available on iPhone?
No. ChatGPT Record Mode is currently available in the macOS desktop app, not as an iPhone-native offline transcription feature. If you need private transcription directly on iPhone, use an offline iOS app like VoiceScriber.
What's the best offline alternative to ChatGPT transcription?
VoiceScriber is an offline iPhone transcription app. It works on-device in 100+ languages and does not send recordings or transcripts to a server.