Can ChatGPT transcribe audio in 2026? Yes — via ChatGPT Record Mode in the macOS desktop app and the OpenAI speech-to-text API. But the important catch is still the same: ChatGPT transcription is cloud-based, requires internet, and your transcript handling depends on whether you use consumer ChatGPT, a business workspace, or the API. If you are searching for a private iPhone transcription app that works in airplane mode, VoiceScriber is the high-privacy alternative: 100% offline, on-device, supports 100+ languages, and never sends recordings or transcripts to a server.
TL;DR
Yes, ChatGPT can transcribe audio, but it is not an offline transcription app. Record Mode is available in the ChatGPT macOS desktop app for paid/business workspaces and now supports recordings up to 4 hours / 240 minutes per session. The OpenAI speech-to-text API supports gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, and whisper-1, with a 25 MB file upload limit. The API audio transcription and translation endpoints currently show no training, no abuse-monitoring retention, and no application-state retention. Consumer ChatGPT transcripts/canvases may be used for model improvement unless you opt out, while Business/Enterprise/Edu content is excluded by default. If your requirement is "no recording leaves my iPhone", choose VoiceScriber: private, offline, on-device transcription for iOS.
Table of contents
- Can ChatGPT transcribe audio?
- Record Mode vs API vs Realtime: what's the difference?
- What data does OpenAI delete, retain, or use for training?
- What changed in 2026?
- What are the current limits?
- ChatGPT transcription: key pros
- Where ChatGPT transcription breaks down
- Who should choose an offline iPhone transcription app instead?
- The private offline alternative: VoiceScriber
- At-a-glance comparison table
- If you must use ChatGPT, harden your setup
- How we tested
- Related articles
- FAQs
1. Can ChatGPT transcribe audio?
Yes. As of May 2026, OpenAI offers two practical ways to transcribe audio with ChatGPT or OpenAI models:
- ChatGPT Record Mode: a feature in the macOS desktop app that live-transcribes meetings, brainstorms, interviews, and voice notes, then creates editable notes or a private canvas.
- OpenAI Speech-to-Text API: a cloud API for file-based transcription and English translation using models such as gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, and whisper-1.
Both options are useful if you want AI summaries, cleanup, prompts, and speaker-aware transcripts. But they are not offline. Audio or transcript data is processed through OpenAI's cloud, which matters for lawyers, clinicians, journalists, finance teams, students recording sensitive lectures, and anyone who wants local-only voice notes on iPhone.
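For the API route described above, a minimal sketch of a file-based transcription call might look like the following. It assumes the official `openai` Python package (not mentioned in this article) and an `OPENAI_API_KEY` environment variable; the helper names `check_upload_size` and `transcribe` are hypothetical, and treating 25 MB as 25 × 1024 × 1024 bytes is an assumption about how the limit is counted.

```python
from pathlib import Path

# 25 MB upload cap cited in this article; the exact byte definition is an assumption.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def check_upload_size(path: str) -> int:
    """Return the file size in bytes, or raise if it exceeds the upload cap."""
    size = Path(path).stat().st_size
    if size > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"{path} is {size} bytes; split or compress it before uploading"
        )
    return size


def transcribe(path: str, model: str = "gpt-4o-mini-transcribe") -> str:
    """Upload one audio file to the cloud transcription endpoint and return its text."""
    # Imported lazily so the size check above works without the package installed.
    from openai import OpenAI  # third-party: pip install openai

    check_upload_size(path)
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text
```

Calling `transcribe("memo.m4a")` (a hypothetical file name) sends the audio to OpenAI's servers, so this path is only appropriate when uploading the recording is acceptable.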
2. Record Mode vs OpenAI API vs Realtime: what's the difference?
| Feature | Record Mode | Speech-to-Text API | Realtime API |
|---|---|---|---|
| Best for | Meetings, voice notes, brainstorms, summaries | Uploading audio files for transcription | Streaming voice apps and low-latency audio |
| Platform | macOS desktop app only | Any platform via API | Any platform via WebRTC/WebSocket |
| Access | Plus, Pro, Business, Enterprise, Edu | API key with billing | API key with billing |
| Current models | Internal / not user-selectable | gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, whisper-1 | Realtime-specific models and transcription settings |
| Speaker diarization | Supports multiple speakers | Use gpt-4o-transcribe-diarize with diarized_json | Depends on implementation |
| Main limit | 4 hours / 240 minutes per session | 25 MB per file upload | Depends on realtime session, model, and usage limits |
| Privacy posture | Audio deleted after transcription; transcript/canvas follows workspace retention and training settings | Audio transcription/translation endpoints currently show no training, no abuse-monitoring retention, and no application-state retention | Cloud-based; check API data controls and retention settings |
| Offline? | No | No | No |
Bottom line: ChatGPT is strong for cloud AI workflows. VoiceScriber is better when the buying intent is offline transcription app for iPhone, private voice-to-text, no cloud transcription, or airplane mode transcription.
3. What data does OpenAI delete, retain, or use for training?
This is the most important privacy question — and the answer depends on which OpenAI product you use.
Record Mode (ChatGPT macOS app)
- Audio recordings are used only for transcription and deleted afterward.
- Transcripts and canvases follow your workspace retention settings. If you delete the conversation, the associated transcript/canvas is removed from OpenAI systems within 30 days, unless OpenAI is legally required to retain it.
- Reference record history: When enabled, ChatGPT can reference past recording canvases and transcripts in future conversations. This can be useful for meeting memory, but it is a privacy consideration for sensitive topics.
- Consumer ChatGPT: If model improvement is enabled, transcripts and canvases may be used to improve OpenAI models unless you opt out.
- Business, Enterprise, and Edu: Record Mode transcripts and canvases are excluded from model training by default.
Speech-to-Text API
- The /v1/audio/transcriptions and /v1/audio/translations endpoints currently show no training, no abuse-monitoring retention, and no application-state retention.
- OpenAI says API inputs and outputs are not used for training by default unless an organization explicitly opts in.
- Other API endpoints can have different retention behavior, so do not assume every OpenAI endpoint has the same privacy profile as the audio transcription endpoints.
Legal holds and litigation
- Legal obligations can override ordinary deletion timelines. OpenAI previously reported a 2025 preservation order in the New York Times litigation, later saying the indefinite-retention obligation ended on September 26, 2025, while a limited historical April–September 2025 set remained stored under legal restrictions.
Key takeaway for privacy-first users: OpenAI's API audio endpoints currently have the cleanest cloud-retention story, but they are still cloud endpoints. If your requirement is "no upload, no server, no third-party retention risk", use an on-device iPhone app like VoiceScriber instead.
4. What changed in 2026?
- Record Mode's limit is now 4 hours / 240 minutes. The earlier 120-minute cap is outdated.
- Record Mode remains macOS-only. It is available for Plus, Pro, Business, Enterprise, and Edu workspaces, but it is not an iPhone transcription app.
- Reference record history is now a key privacy setting. When enabled, past recording transcripts and canvases can inform future ChatGPT responses.
- Speaker-aware API transcription is stronger. gpt-4o-transcribe-diarize supports speaker-labeled output through diarized_json, which is useful for meetings and interviews.
- The API retention story is endpoint-specific. The audio transcription and translation endpoints currently show no training, no abuse-monitoring retention, and no application-state retention, but other endpoints may still retain data for abuse monitoring or application state.
- On-device ASR is improving fast. Recent on-device speech recognition research shows that local transcription can be competitive on accuracy and latency, strengthening the case for offline iPhone transcription when privacy and reliability matter.
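To make the speaker-labeled output mentioned in the list above more concrete, here is a small sketch that turns diarized segments into a readable transcript. The segment shape (a list of entries with `speaker`, `start`, `end`, and `text` fields) is an assumption for illustration and should be checked against the current API reference; `merge_turns` and `format_turns` are hypothetical helper names.

```python
def merge_turns(segments: list[dict]) -> list[dict]:
    """Join consecutive segments from the same speaker into one turn."""
    turns: list[dict] = []
    for seg in segments:
        text = seg["text"].strip()
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            # Same speaker keeps talking: extend the previous turn.
            turns[-1]["text"] += " " + text
            turns[-1]["end"] = seg["end"]
        else:
            turns.append(
                {
                    "speaker": seg["speaker"],
                    "start": seg["start"],
                    "end": seg["end"],
                    "text": text,
                }
            )
    return turns


def format_turns(turns: list[dict]) -> list[str]:
    """Render each turn as '[MM:SS] speaker: text' using the turn's start time."""
    lines = []
    for turn in turns:
        minutes, seconds = divmod(int(turn["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {turn['speaker']}: {turn['text']}")
    return lines
```

Merging adjacent segments is a presentation choice: it keeps a long answer from one speaker as a single paragraph instead of many short fragments.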
5. What are the current limits?
| Limit | Record Mode | Speech-to-Text API |
|---|---|---|
| Session/file cap | 4 hours / 240 minutes per recording session | 25 MB per file upload |
| Platform | macOS desktop app only | Any platform via API |
| Plan required | Plus, Pro, Business, Enterprise, Edu | API key with billing |
| Speaker labels | Multiple-speaker support | gpt-4o-transcribe-diarize supports speaker-aware diarized_json |
| Language performance | Works best in English today; other languages supported but accuracy can vary | Supports many languages; transcription outputs the source language, while translation output is English-only |
| Connectivity | Internet required | Internet required |
| Best privacy fit | Cloud-friendly users who want instant summaries | Developers needing cloud transcription with endpoint-specific retention controls |
For files larger than 25 MB, OpenAI recommends splitting audio into smaller chunks or using compressed formats. If you want a tool that does not depend on API upload limits, Wi‑Fi, or server-side processing, see our airplane-mode test of 7 popular transcription tools.
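The chunking advice above can be sketched with Python's standard `wave` module for uncompressed WAV input. This is an illustration under stated assumptions, not OpenAI's recommended tooling: `split_wav` and `HEADER_HEADROOM` are hypothetical names, the 25 MB cap is treated as 25 × 1024 × 1024 bytes, and the split happens at fixed frame counts, so a cut can land in the middle of a word.

```python
import wave
from pathlib import Path

# Bytes reserved per chunk for the WAV header (a plain PCM header is 44 bytes).
HEADER_HEADROOM = 1024


def split_wav(src: str, out_dir: str, max_bytes: int = 25 * 1024 * 1024) -> list[Path]:
    """Split a WAV file into consecutive chunks that each stay under max_bytes."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    parts: list[Path] = []
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frame_size = params.sampwidth * params.nchannels  # bytes per audio frame
        frames_per_chunk = (max_bytes - HEADER_HEADROOM) // frame_size
        if frames_per_chunk <= 0:
            raise ValueError("max_bytes is too small to hold a single audio frame")
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source audio
            target = out / f"{Path(src).stem}_part{index:03d}.wav"
            with wave.open(str(target), "wb") as writer:
                writer.setparams(params)  # frame count is corrected on close
                writer.writeframes(frames)
            parts.append(target)
            index += 1
    return parts
```

For long recordings, converting to a compressed format first will usually need far fewer chunks than splitting raw WAV, and cutting at silences rather than fixed offsets tends to give cleaner transcripts at the chunk boundaries.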
6. ChatGPT transcription: key pros
- End-to-end convenience: Record, transcribe, and auto-summarize in one place; outputs can be turned into tasks, emails, or plans.
- Speaker diarization: Record Mode supports multiple speakers; the API offers gpt-4o-transcribe-diarize for speaker identification.
- Enterprise-friendly controls: Business/Enterprise/Edu content is excluded from training by default; admins can control or disable Record Mode.
- API audio endpoints have no retention: The transcription and translation endpoints currently retain no data for training, abuse monitoring, or application state.
- Multilingual: The speech-to-text models support a wide range of languages, though accuracy varies.
7. Where ChatGPT transcription breaks down
- It is not offline: Record Mode and the API require internet. If you need to transcribe on a flight, in a hospital dead zone, in court, or during field interviews without signal, ChatGPT is the wrong fit.
- It is not iPhone-native transcription: Record Mode is macOS-only. iPhone users who want tap-to-record offline transcription need a separate iOS app.
- Record history can create context bleed: When "reference record history" is enabled, past transcripts can influence future responses. That is useful for continuity but risky for sensitive meetings.
- Consumer training settings matter: Consumer ChatGPT content may be used for model improvement unless you opt out. Many casual users never review this setting.
- Legal holds can override deletion promises: Even when a product has normal deletion windows, active litigation or legal obligations can change what must be retained.
- API file limits remain: The API caps file uploads at 25 MB, and long recordings may need chunking.
8. Who should choose an offline iPhone transcription app instead?
If you found this article by searching for "ChatGPT transcription privacy," "offline transcription app iPhone," "AI voice notes no cloud," or "transcribe audio without uploading," you are probably not just looking for transcription — you are looking for control.
- Lawyers and legal teams: Attorney-client conversations should stay under your control. A local-only iPhone transcription app reduces third-party exposure. See our guide on secure offline transcription for lawyers.
- Clinicians and therapists: Patient notes and therapy recordings can include sensitive health information. Keeping audio on-device simplifies the data-flow conversation. Read more about on-device transcription for healthcare and therapy notes.
- Journalists and researchers: Source interviews are often sensitive. Offline transcription avoids creating a cloud copy of raw audio.
- Finance, HR, and compliance teams: Internal investigations, earnings prep, employee notes, and regulated conversations should minimize unnecessary processors.
- Students, creators, and travelers: Lectures, ideas, and interviews should still be transcribable on a train, plane, or in low-signal environments.
The business case is not theoretical. IBM's 2025 Cost of a Data Breach Report puts the global average breach cost at US $4.4 million and highlights the growing risk of ungoverned AI. For sensitive voice data, fewer cloud hops mean fewer places for things to go wrong.
9. The private offline alternative: VoiceScriber for iPhone
If your requirement is "transcribe audio on my iPhone without uploading it", choose a tool built around that requirement from the start.
VoiceScriber is built for high-privacy iOS transcription:
- 100% offline, on-device transcription — works in airplane mode and does not send recordings or transcripts to a server.
- Private iPhone voice notes — record, transcribe, edit, search, and export from your device.
- 100+ languages supported offline.
- No account required — no sign-up barrier, no cloud sync requirement, no server-side transcript storage.
- Built for real-world privacy workflows — lawyers, clinicians, journalists, students, creators, and teams that need reliable transcription even without Wi‑Fi.
VoiceScriber is purpose-built for privacy-critical iPhone workflows. All audio and transcripts remain on your device unless you explicitly export or share them. There is no upload step, no cloud transcription queue, and no third-party transcript retention policy to interpret.
Need private transcription on iPhone — without uploading audio?
VoiceScriber works offline in 100+ languages and keeps recordings and transcripts on your device.
Download VoiceScriber on the App Store
10. At-a-glance: ChatGPT transcription vs. VoiceScriber
| Factor | ChatGPT (Record Mode / API) | VoiceScriber (Offline iPhone Alternative) |
|---|---|---|
| Connectivity | Internet required | No internet needed; airplane mode OK |
| Where processing happens | OpenAI cloud | On your iPhone |
| Training usage | Consumer ChatGPT may train on transcripts/canvases unless you opt out; business tiers and API are excluded by default | No upload; recordings stay local unless you export/share |
| Data retention | API audio endpoints currently show no retention; Record Mode transcripts follow workspace retention and legal requirements | Local only until you choose to export |
| Speaker diarization | Record Mode supports multiple speakers; API offers gpt-4o-transcribe-diarize | Best for private voice notes, interviews, and on-device transcription workflows |
| Languages | Many languages supported; Record Mode works best in English today | 100+ languages offline |
| Limits | Record Mode: 4 hours / 240 minutes; API: 25 MB upload limit | Device-bound and app-bound, with no cloud upload limit |
| Platform | Record Mode: macOS only; API: any platform for developers | iPhone / iOS |
| Best for | Cloud-friendly AI summaries, team workflows, API integrations | Private iPhone transcription, offline voice notes, no-cloud workflows |
11. If you must use ChatGPT for transcription, harden your setup
- Turn off training for consumer ChatGPT: Use OpenAI's privacy controls so future conversations are not used to improve models.
- Prefer Business, Enterprise, Edu, Healthcare, or API for sensitive workflows: These are excluded from model training by default unless your organization opts in.
- Review "Reference record history": Disable it if you do not want past recording transcripts and canvases referenced in future chats.
- Use the audio API for one-off sensitive cloud transcription: The audio transcription/translation endpoints currently have the cleanest OpenAI API retention story.
- Do not mix sensitive recordings with general prompts: Keep privileged, clinical, financial, or HR recordings separate from casual ChatGPT use.
- Choose offline for zero-upload needs: If policy says audio must not leave the device, skip cloud transcription entirely and use VoiceScriber.
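The checklist above can also be written down as a small pre-flight check that a team runs before any recording goes to a cloud service. This is only a sketch of the article's advice: the `Recording` fields, the account-type labels, and the `hardening_issues` function are hypothetical and do not correspond to any OpenAI setting names or API.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    must_stay_on_device: bool  # policy says audio may not leave the device
    sensitive: bool  # privileged, clinical, financial, or HR content
    account_type: str  # hypothetical labels: "consumer", "business", "enterprise", "edu", "api"
    training_opt_out: bool = False
    reference_history_enabled: bool = True


def hardening_issues(rec: Recording) -> list[str]:
    """Return the checklist items that still need attention for this recording."""
    if rec.must_stay_on_device:
        # Nothing else matters if uploading is forbidden outright.
        return ["Policy forbids upload: use an on-device tool instead of cloud transcription."]
    issues = []
    if rec.account_type == "consumer" and not rec.training_opt_out:
        issues.append("Turn off model training in the privacy controls.")
    if rec.sensitive and rec.account_type == "consumer":
        issues.append("Prefer a Business, Enterprise, Edu, or API route for sensitive audio.")
    if rec.sensitive and rec.reference_history_enabled and rec.account_type != "api":
        issues.append("Disable 'Reference record history' before recording.")
    return issues
```

An empty list means none of the checklist items apply; it does not mean the recording is free of risk, since legal holds and retention settings sit outside what a local check can see.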
12. How we tested
To write this guide, we tested ChatGPT transcription (Record Mode and the speech-to-text API) and VoiceScriber side by side across five real-world scenarios:
- Quiet English memo — a solo voice note recorded in a silent room, approximately 3 minutes.
- Noisy cafe environment — a voice recording captured in a busy coffee shop with background chatter, music, and espresso machine noise.
- Two-speaker meeting — a simulated two-person meeting to test speaker separation and overlapping speech handling.
- Accented English — recordings from speakers with non-native English accents (Turkish, German) to evaluate robustness.
- Non-English audio — clips in Turkish, Spanish, and Japanese to compare multilingual accuracy and offline language coverage.
For each scenario, we compared accuracy, latency, and whether the tool worked without an internet connection. VoiceScriber was tested in airplane mode throughout. ChatGPT required a stable Wi-Fi connection for every test.
This is not a formal benchmark — it is a practical, hands-on evaluation designed to reflect how these tools perform in the situations most readers actually face.
FAQs
Does ChatGPT transcription work offline?
No. ChatGPT Record Mode and the OpenAI speech-to-text API require internet. If you need offline transcription on iPhone, VoiceScriber works on-device and does not upload recordings.
Does OpenAI train on my transcripts?
- Consumer ChatGPT: Content may be used to improve models unless you opt out.
- Business, Enterprise, Edu, Healthcare: Excluded from training by default.
- API audio endpoints: Not used for training, with no abuse-monitoring retention for /v1/audio/transcriptions and /v1/audio/translations.
How long does OpenAI keep my audio data?
It depends on the product. Record Mode deletes audio after transcription; transcripts and canvases follow workspace retention and are removed within 30 days after deletion unless legal obligations require retention. The API audio endpoints (/v1/audio/transcriptions, /v1/audio/translations) currently show no training, no abuse-monitoring retention, and no application-state retention.
What is the ChatGPT Record Mode limit?
OpenAI currently lists Record Mode's recording length cap as 4 hours / 240 minutes per session. API file uploads are limited to 25 MB.
What languages does ChatGPT transcription support?
The speech-to-text API supports a wide range of languages across its models (gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, whisper-1). Record Mode works best in English today. Accuracy varies by language and model.
Can ChatGPT Record Mode reference my past recordings?
Yes, when "reference record history" is enabled. Past recording transcripts and canvases can be referenced in later conversations. You can disable this in your settings.
Is ChatGPT Record Mode available on iPhone?
No. ChatGPT Record Mode is currently available in the macOS desktop app, not as an iPhone-native offline transcription feature. If you need private transcription directly on iPhone, use an offline iOS app like VoiceScriber.
What's the best offline alternative to ChatGPT transcription?
VoiceScriber is an offline iPhone transcription app. It works on-device in 100+ languages and does not send recordings or transcripts to a server.