I want a progressive web app that transcribes my voice using Whisper API and lets me copy the transcription to the clipboard.
The app should be very simple - just (from top to bottom) a logo.png, big red record/stop button, a text field (where transcription appears) and a big `copy to clipboard` button.
You click to record, click again to stop, it's sent to Whisper API and then transcribed, then appears in text field to copy.
The design should be done very much with phones in mind, as that's where I will be using it.
It would be a nice effect if when recording the button changes color and throbs.
It would also be a nice effect if the background was a light gradient - from very light purple to very light pink.
Style could maybe have some neon colors.
This can be a ONE PAGE APP. We do NOT need to save to firebase so NO NEED for firebase config.
This is a mobile-first app so it will definitely need manifest.json, even if we can't send to the Whisper API when offline.
# Voice 2 Clip
A progressive web application that enables quick voice-to-text transcription using OpenAI's Whisper API, with a simple, mobile-first interface for recording and copying transcribed text.
## Project Overview
Voice 2 Clip is designed for seamless voice transcription on mobile devices. The app features a minimalist interface with:
- Logo display
- Recording button with visual feedback
- Transcription text field
- Copy to clipboard functionality
The app employs a modern, gradient-based design with neon accent colors and responsive animations, making it both functional and visually appealing on mobile devices.
## Features
- One-click voice recording
- Real-time recording feedback with animated button
- OpenAI Whisper API integration for accurate transcription
- Clipboard integration for easy text copying
- Mobile-first responsive design
- Progressive Web App capabilities
- Elegant UI with gradient background and neon accents
## Prerequisites & Installation
1. Clone the repository:
```bash
git clone https://github.com/yourusername/voice-2-clip.git
cd voice-2-clip
```
2. Install dependencies:
```bash
npm install react-router-dom openai
```
3. Create a `.env` file in the root directory:
```
REACT_APP_OPENAI_API_KEY=your_openai_api_key
```
## Running the App
Start the development server:
```bash
npm start
```
Access the app at `http://localhost:3000`
## Deployment Notes
1. Build the production version:
```bash
npm run build
```
2. The app is PWA-ready with:
- Configured `manifest.json`
- Service Worker for offline functionality
- Appropriate icons and assets
## Environment Variables
Required environment variables:
- `REACT_APP_OPENAI_API_KEY`: Your OpenAI API key for Whisper API access
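Create a `.env` file in the project root and add this variable. Create React App reads it when the development server starts or the app is built, so restart `npm start` after changing it. Note that `REACT_APP_` variables are embedded in the built JavaScript bundle, so the key is visible to anyone who can load the app.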
## API Integration
The app integrates with OpenAI's Whisper API for voice transcription:
- API calls are handled in `src/services/whisperApi.js`
- Transcription requests are made using the OpenAI SDK
- Audio is recorded in WebM format and sent to the API
- Responses are processed and displayed in the text field
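As a rough illustration of the steps above, here is a minimal sketch of what a transcription helper in `src/services/whisperApi.js` might look like. To stay dependency-free it calls the transcription endpoint with `fetch` instead of the OpenAI SDK the project uses. The function name `transcribeAudio`, the injectable `fetchImpl` parameter, and the upload file name `recording.webm` are assumptions made for this example, not the project's actual code.

```javascript
// Hypothetical sketch of a Whisper transcription helper (not the project's actual code).
const WHISPER_URL = 'https://api.openai.com/v1/audio/transcriptions';

/**
 * Send a recorded audio Blob to the transcription endpoint and return the text.
 * `fetchImpl` is injectable (an assumption, for testability) and defaults to global fetch.
 */
async function transcribeAudio(audioBlob, apiKey, fetchImpl = fetch) {
  if (!apiKey) {
    throw new Error('Missing OpenAI API key');
  }

  // The endpoint expects multipart form data with the audio file and a model name.
  const form = new FormData();
  form.append('file', audioBlob, 'recording.webm');
  form.append('model', 'whisper-1');

  const response = await fetchImpl(WHISPER_URL, {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });

  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }

  // The default JSON response carries the transcript in its `text` field.
  const data = await response.json();
  return data.text;
}
```

In the app, the key would presumably be read from `process.env.REACT_APP_OPENAI_API_KEY` and the returned string placed in the transcription text field.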
## Project Structure
```
voice-2-clip/
├── src/
│ ├── components/
│ │ ├── AudioRecorder/
│ │ ├── Logo/
│ │ ├── TranscriptionField/
│ │ └── CopyButton/
│ ├── pages/
│ │ ├── HomePage/
│ │ └── ErrorPage/
│ ├── services/
│ │ └── whisperApi.js
│ └── App.js
├── public/
│ ├── manifest.json
│ └── serviceWorker.js
└── package.json
```
## Mobile Support
The app is optimized for mobile devices with:
- Responsive layouts
- Touch-friendly interface
- PWA installation support
- Offline functionality (excluding API calls)
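One way the "offline except for API calls" behavior could be arranged is sketched below: the app shell is pre-cached on install and served cache-first, while requests to the OpenAI host are left to the network. The cache name, the asset list, and the helper names (`isCacheable`, `cacheFirst`, `registerOfflineHandler`) are assumptions for illustration. The project's real `serviceWorker.js` may work differently.

```javascript
// Hypothetical sketch for public/serviceWorker.js (not the project's actual code).
const CACHE_NAME = 'voice-2-clip-v1'; // assumed cache name
const APP_SHELL = ['/', '/index.html', '/manifest.json', '/logo.png']; // assumed asset list
const API_HOST = 'api.openai.com';

// Only GET requests for app assets are answered from the cache;
// transcription calls always go to the network.
function isCacheable(request) {
  return request.method === 'GET' && new URL(request.url).hostname !== API_HOST;
}

// Cache-first lookup with a network fallback.
async function cacheFirst(request, cacheStorage, fetchImpl) {
  const cached = await cacheStorage.match(request);
  return cached || fetchImpl(request);
}

// Attach install and fetch listeners to a service worker global scope.
function registerOfflineHandler(scope) {
  scope.addEventListener('install', (event) => {
    // Pre-cache the app shell so the UI can load without a connection.
    event.waitUntil(
      scope.caches.open(CACHE_NAME).then((cache) => cache.addAll(APP_SHELL))
    );
  });

  scope.addEventListener('fetch', (event) => {
    if (isCacheable(event.request)) {
      event.respondWith(
        cacheFirst(event.request, scope.caches, (req) => scope.fetch(req))
      );
    }
    // Without respondWith, the browser handles the request as usual.
  });
}
```

Inside the service worker file itself, this would be wired up with `registerOfflineHandler(self);`.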
## Troubleshooting
Common issues:
- Microphone access denied: Enable microphone permissions in browser settings
- API key errors: Verify your OpenAI API key in `.env`
- PWA not installing: Ensure HTTPS is enabled on your hosting
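The microphone case can also be surfaced in the UI rather than left to the console. The sketch below maps the error names that `getUserMedia` rejects with to user-facing messages. The function name `describeRecordingError` and the message wording are assumptions for illustration.

```javascript
// Hypothetical helper that turns getUserMedia errors into readable messages.
function describeRecordingError(error) {
  switch (error && error.name) {
    case 'NotAllowedError':
      // The user or the browser blocked microphone access.
      return 'Microphone access was denied. Enable it in your browser settings.';
    case 'NotFoundError':
      // No audio input device is available.
      return 'No microphone was found on this device.';
    case 'NotReadableError':
      // The device exists but could not be opened.
      return 'The microphone could not be accessed. It may be in use by another app.';
    default:
      return 'Recording failed. Please try again.';
  }
}
```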
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License.
SESSION 9 - DEBUG SUMMARY
Here's a comprehensive summary of the Voice 2 Clip app:
1. WHAT THE APP DOES
Voice 2 Clip is a Progressive Web App (PWA) designed primarily for mobile use that enables users to record voice messages and get them transcribed to text using OpenAI's Whisper API. The app features a minimal, mobile-first interface with a gradient background and neon-style accents.
The core functionality revolves around a central record button that pulses when active. Users can start/stop recording with a single tap, after which the audio is automatically sent to the Whisper API for transcription. The resulting text appears in a text field below, and users can copy it to their clipboard with a single tap.
From a technical perspective, the app utilizes React.js for the frontend, with vanilla CSS for styling. It implements the MediaRecorder API for audio capture, converts the audio to a suitable format for the Whisper API, and handles the API communication through a dedicated service. The app includes PWA features through manifest.json and supports offline functionality (though transcription requires an internet connection).
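The tap-to-record, tap-to-stop flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the AudioRecorder component: the name `createRecorder`, the `onComplete` callback, and the injectable `deps` object are all hypothetical, and the browser APIs are only looked up when no replacement is passed in.

```javascript
// Hypothetical sketch of tap-to-record / tap-to-stop logic (not the project's actual code).
function createRecorder(onComplete, deps = {}) {
  // Browser APIs are injectable so the logic can be exercised outside a browser.
  const getStream =
    deps.getStream || (() => navigator.mediaDevices.getUserMedia({ audio: true }));
  const Recorder = deps.Recorder || MediaRecorder;

  let mediaRecorder = null;
  let chunks = [];

  async function start() {
    const stream = await getStream();
    chunks = [];
    mediaRecorder = new Recorder(stream);

    // Collect audio data as it becomes available.
    mediaRecorder.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) chunks.push(event.data);
    };

    // When recording stops, release the microphone and hand back one Blob.
    mediaRecorder.onstop = () => {
      stream.getTracks().forEach((track) => track.stop());
      const type = mediaRecorder.mimeType || 'audio/webm';
      mediaRecorder = null;
      onComplete(new Blob(chunks, { type }));
    };

    mediaRecorder.start();
  }

  function stop() {
    if (mediaRecorder && mediaRecorder.state === 'recording') mediaRecorder.stop();
  }

  return {
    isRecording: () => mediaRecorder !== null && mediaRecorder.state === 'recording',
    // One handler for the record button: start when idle, stop when recording.
    toggle: async () => {
      if (mediaRecorder) stop();
      else await start();
    },
  };
}
```

A record button would call `toggle()` on each tap and use `isRecording()` to decide whether to show the pulsing state. The Blob passed to `onComplete` is what would be sent on for transcription.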
2. FILE STRUCTURE AND DEPENDENCIES
CORE FILES:
- src/App.js - Main application container that orchestrates the core components and routing. Interacts with all component files and depends on HomePage and ErrorPage. Critical file that defines the app's structure.
- src/services/whisperApi.js - Handles all communication with OpenAI's Whisper API. Used by AudioRecorder component. Critical for the app's main functionality. Contains API configuration and transcription logic.
- src/components/AudioRecorder.js - Manages audio recording functionality using MediaRecorder API. Depends on whisperApi.js. Critical component that handles the main user interaction.
COMPONENT FILES:
- src/components/RecordButton.js + CSS - Handles the recording button UI and animation. Used by AudioRecorder.
- src/components/TranscriptionField.js + CSS - Displays transcription results. Used by HomePage.
- src/components/CopyButton.js + CSS - Manages clipboard functionality. Used by HomePage.
- src/components/Logo.js + CSS - Displays app logo. Used by HomePage.
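For the clipboard piece, a minimal sketch of the copy action might look like the following. It relies on the browser's asynchronous Clipboard API (`navigator.clipboard.writeText`), which is only available in secure contexts such as HTTPS or localhost. The function name `copyToClipboard` and the optional `clipboard` parameter are assumptions for illustration, not the CopyButton component's actual code.

```javascript
// Hypothetical sketch of the copy action (not the project's actual code).
async function copyToClipboard(text, clipboard) {
  // Default to the browser's async Clipboard API; injectable for testing.
  const target =
    clipboard || (typeof navigator !== 'undefined' ? navigator.clipboard : undefined);

  // Nothing to copy, or the Clipboard API is unavailable in this context.
  if (!text || !target || typeof target.writeText !== 'function') {
    return false;
  }

  try {
    await target.writeText(text);
    return true;
  } catch (error) {
    // writeText rejects when the browser refuses the write, e.g. permission denied.
    return false;
  }
}
```

The boolean result lets the button show a brief "Copied" confirmation or an error state.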
PAGE FILES:
- src/pages/HomePage.js + CSS - Main page container orchestrating all components.
- src/pages/ErrorPage.js + CSS - 404 error handling page.
STYLING:
- src/App.css - Global styles and theme definitions.
3. FRONTEND STRUCTURE
Main Page (HomePage):
- Contains logo at top
- Central record button with pulse animation when recording
- Transcription text field below
- Copy to clipboard button at bottom
- Files: HomePage.js, all component files
Error Page:
- 404 error handling
- "Back to Home" navigation
- Files: ErrorPage.js
4. RECOMMENDED INITIAL FILES TO SHARE
When starting a new chat about editing this app, I recommend sharing these files in this order:
1. src/App.js
2. src/services/whisperApi.js
3. src/components/AudioRecorder.js
4. src/App.css
5. src/pages/HomePage.js
6. src/components/RecordButton.js
These six files give a complete picture of the recording mechanism, API integration, and main UI structure, which are the areas most likely to need modification.